An invitation for you, introducing NLP, and Word Vectors

Photo by Iñaki del Olmo on Unsplash

Natural Language Processing (NLP), the area of AI behind today's gigantic language models (yes, GPT-3, I’m watching you), presents what is perceived as a revolution in machines' ability to perform a wide range of language tasks.

As a result, public perception is split: some believe these new language models will pave the way to a Skynet-type technology, while others dismiss them as hype-fueled technologies that will be gathering dust on shelves, or hard drives, in little to no time.

Invitation

Motivated by this, I’m creating this series of stories…


Create a reliable script to extract historical trade data from Bitfinex

Photo by Jason Briscoe on Unsplash

Note: this article is provided for entertainment and educational purposes only and is not intended as financial advice.

As an ex-CTO of a cryptocurrency exchange, I had my fair share of messing around with several major exchanges’ APIs. In this article, I will guide you through the process of creating a reliable Python script to extract historical trade data from Bitfinex.

I will spare you the algotrading/backtesting intro and get right into the code, since you can read my previous story on this subject, which covers how to extract historical trade data from Binance. …


Deep dive into Word2vec, GloVe, and word senses

Photo by Kelly Sikkema on Unsplash

In the previous post, we introduced NLP. To find out word meanings with the Python programming language, we used the NLTK package and worked our way into word embeddings using the gensim package and Word2vec.

Since we only covered the Word2vec technique from a 10,000-foot view, we are now going to dive deeper into the training method used to create a Word2vec model.

Word2vec family

Word2vec (Mikolov et al. 2013)[1][2] is not a single technique or algorithm. It’s actually a family of neural network architectures and optimization techniques that can produce good results when learning embeddings from large datasets.

The network architectures are…


Distinguish IO-bound and CPU-bound applications in Python to write code that runs faster

Image from MonikaP by Pixabay

In which cases is it appropriate to use concurrency libraries in Python, and when can they actually increase performance?
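To make the question concrete, here is a minimal sketch of one case where concurrency tends to help: a task that spends its time waiting. It is an illustration under stated assumptions, not a benchmark. The names `fake_io_task`, `run_sequential`, and `run_threaded` are hypothetical, and `time.sleep` stands in for a real network or disk call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(delay: float) -> float:
    """Hypothetical stand-in for an IO-bound call: it only waits."""
    time.sleep(delay)  # assumption: sleeping approximates waiting on IO
    return delay


def run_sequential(delays):
    """Run the tasks one after another and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [fake_io_task(d) for d in delays]
    return results, time.perf_counter() - start


def run_threaded(delays):
    """Run the tasks in a thread pool and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        # pool.map returns results in the same order as the inputs
        results = list(pool.map(fake_io_task, delays))
    return results, time.perf_counter() - start


delays = [0.2] * 4
seq_results, seq_elapsed = run_sequential(delays)
thr_results, thr_elapsed = run_threaded(delays)
print(f"sequential: {seq_elapsed:.2f}s, threaded: {thr_elapsed:.2f}s")
```

Because the threads spend their time waiting rather than computing, the waits overlap and the threaded run should finish in roughly the time of a single task. A task that keeps the CPU busy would typically not see the same gain from threads in a standard CPython build.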

Python Interpreter

To answer this question, we first need to discuss how the Python interpreter works. Throughout this story, when detailing how the interpreter works, we will be referring to CPython, which is the reference implementation of the Python language and also the standard and most widely used interpreter among developers.

CPython is the reference interpreter, created by Guido van Rossum, creator of the Python language. As the language grew in popularity, other interpreters were created by the…


By developing “Custom Visuals”

Photo by Kelly Sikkema on Unsplash

As the amount of data available in organizations grows, presenting it clearly and directly is increasingly important. In this context, Power BI, Microsoft’s business analytics tool, has gained prominence.

Although its built-in components and navigation mechanisms are enough to meet most regular enterprise needs, the platform also stands out for its customization possibilities.

Besides being able to customize the platform’s built-in components, it is possible, with some front-end engineering skills, to develop new ones from scratch.

Developing a Power BI Custom Visual

The development process of a new component takes place through the programming of Custom Visuals


Unless you’re a full-time autoregressive language model

Photo by Matt Noble on Unsplash

You’re kind of (or a lot) into tech. You get your news from Twitter, Reddit, Medium, #random in Slack, etc.

For the past two months, your feed has been getting increasingly full of GPT-3-related posts. You start to get curious and go ahead and research what all the fuss about the new OpenAI autoregressive language model API is about.

Maybe you’re curious enough to get into the nitty-gritty, like reading the paper and trying to get early access to the API to get your hands dirty.

In the meantime, you find some projects that are using…


My take after using Deepnote to develop a Python course

Disclaimer: I’m in no way affiliated with Deepnote or any of its members.

Deepnote is a free online data science notebook, mainly focused on collaboration (the real-time, Google Docs type of collaboration) and on abstracting away the work that gets in the way of work: environment and infrastructure setup.


The startup recently announced that it raised a $3.8 million seed round led by Index Ventures and including angel investors like Greg Brockman and Naval Ravikant.

After reading the TC announcement, I was tempted to give it a go. I signed up for the beta and waited patiently…


Create a reliable script to extract historical trade data from Binance

Photo by Austin Distel on Unsplash

Note: this article is provided for entertainment and educational purposes only and is not intended as financial advice.

As an ex-CTO of a cryptocurrency exchange, I had my fair share of messing around with several major exchanges’ APIs. In this article, I will guide you through the process of creating a reliable Python script to extract historical trade data from Binance.

Rationale

When backtesting a trading strategy, that is, executing our strategy against past data and analyzing the returns and other important factors, we have to make sure that we have the appropriate kind of data to work with. The…


A no-code, no-hype technical introduction


1. First Things First

1.1 Blockchain: another data structure?

In computer science, a data structure is defined as a collection of data, its organization, and a set of operations for handling it. Some examples are linked lists, stacks, trees, and graphs.

Blockchain has its deepest roots in data structures, but approaching it just from that angle is insufficient. The network features (that is, the use of this data structure on a set of computers connected through a network) are an equally important part of the topic.

The blockchain was conceived with a network without a central entity in mind, where several participants, who run the same code…


Speed up your Python time series data handling scripts

Ever wondered how you can make your data analysis processes more time-efficient when dealing with large time series data sets? Arctic may be what you’re looking for.

Photo by Steven Lelham on Unsplash

Arctic is a database for Python designed with one thing in mind: performance. Using MongoDB as its underlying database, it stores data efficiently, using LZ4 compression, and can query hundreds of millions of rows per second.

In addition to the performance numbers, it makes a pretty strong case with some of its features:

  • Can handle pandas, NumPy and Python objects (via pickling);
  • Can snapshot several versions of your objects;
  • “Chunks” data for…

Thiago Candido

Founder @ candido.ai
