Basilica is an API that embeds high-dimensional data such as images and text. You send us an image, for example, and we send you back a vector of floats. You can feed these features into traditional ML algorithms like linear regression or k-means clustering.
A common e-commerce problem is showing customers similar products. If you have pictures of your products, Basilica makes it easy to compute a similarity score between those pictures, based on the visual features that actually matter to customers.
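As a rough illustration of what such a similarity score could look like, here is a minimal sketch that ranks products by cosine similarity between their embedding vectors. The vectors, product names, and the `most_similar` helper are made up for this example; they are not Basilica's API, and real embeddings would have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def most_similar(query_id, embeddings, top_k=2):
    """Return the ids of the top_k products closest to query_id (hypothetical helper)."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vec), pid)
        for pid, vec in embeddings.items()
        if pid != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [pid for _, pid in scored[:top_k]]

# Made-up 4-dimensional vectors standing in for real image embeddings.
product_embeddings = {
    "red_sneaker": [0.9, 0.1, 0.0, 0.2],
    "blue_sneaker": [0.8, 0.2, 0.1, 0.3],
    "leather_boot": [0.5, 0.6, 0.1, 0.1],
    "coffee_mug": [0.0, 0.1, 0.9, 0.4],
}

recommendations = most_similar("red_sneaker", product_embeddings)
print(recommendations)
```

Cosine similarity is only one reasonable choice here; Euclidean distance over the same vectors would work in much the same way.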
Basilica makes it easy to integrate high-dimensional data into real-time financial models. For example, you could use the content of recent tweets about a company to predict how its stock price will move.
Abuse is rampant in online communities. Basilica makes it easy to analyze text and images to find fake accounts, identify inappropriate images, or detect fake profile data.
Basilica lets you easily cluster job candidates by the text of their resumes. A number of additional features for this category are on our roadmap, including a source code embedding that will let you cluster candidates by what kind of code they write.
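To make the clustering idea concrete, here is a small k-means sketch in plain Python. It assumes you already have one embedding vector per resume; the two-dimensional vectors below are invented for illustration, and the deterministic "first k vectors" initialization is a simplification rather than a recommended practice.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def kmeans(vectors, k, iterations=20):
    """Basic k-means; returns (cluster index per vector, centroids)."""
    # Simplified, deterministic init: use the first k vectors as centroids.
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = mean_vector(members)
    return assignments, centroids

# Made-up 2-dimensional vectors standing in for resume text embeddings.
resume_embeddings = [
    [0.90, 0.10],
    [0.10, 0.90],
    [0.80, 0.20],
    [0.20, 0.80],
    [0.85, 0.15],
]

assignments, centroids = kmeans(resume_embeddings, k=2)
print(assignments)
```

In practice you would likely reach for a library implementation with better initialization; the point is only that the embeddings are ordinary numeric features.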
Basilica lets you easily integrate asset pictures into a pricing model for those assets. You could, for example, integrate pictures of the inside of a home into an underwriting model, or pictures of a used car into a bidding algorithm.
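One simple way to do this, sketched below, is to append the image embedding to the tabular features you already use and fit the same kind of model as before. Everything here is an assumption made for illustration: the feature names, the scaled values, the prices, and the tiny gradient-descent linear regression that stands in for whatever pricing model you actually run.

```python
def build_feature_row(tabular_features, image_embedding):
    """Concatenate existing features with an embedding vector."""
    return list(tabular_features) + list(image_embedding)

def predict(weights, bias, row):
    return bias + sum(w * x for w, x in zip(weights, row))

def mean_squared_error(weights, bias, rows, targets):
    errors = [predict(weights, bias, r) - t for r, t in zip(rows, targets)]
    return sum(e * e for e in errors) / len(errors)

def fit_linear(rows, targets, learning_rate=0.05, epochs=2000):
    """Linear regression fitted by batch gradient descent."""
    n = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, target in zip(rows, targets):
            error = predict(weights, bias, row) - target
            grad_b += error
            for j, x in enumerate(row):
                grad_w[j] += error * x
        bias -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return weights, bias

# Hypothetical homes: ([scaled square footage, scaled bedrooms],
#                      made-up interior-photo embedding,
#                      price in hundreds of thousands).
homes = [
    ([0.50, 0.3], [0.9, 0.1], 3.2),
    ([0.70, 0.5], [0.2, 0.8], 3.9),
    ([0.40, 0.3], [0.1, 0.9], 2.4),
    ([0.90, 0.7], [0.8, 0.2], 5.6),
    ([0.60, 0.5], [0.7, 0.3], 4.1),
]

rows = [build_feature_row(tabular, emb) for tabular, emb, _ in homes]
prices = [price for _, _, price in homes]

weights, bias = fit_linear(rows, prices)
baseline_error = mean_squared_error([0.0] * len(rows[0]), 0.0, rows, prices)
fitted_error = mean_squared_error(weights, bias, rows, prices)
print(baseline_error, fitted_error)
```

The model itself does not change; it simply sees a few extra numeric columns that summarize the picture.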
Basilica uses a technique called “embedding” to transform high-dimensional data into usable features. It’s similar to other embeddings you may be familiar with, like word2vec.
We produce these embeddings by training deep neural networks to perform a mix of tasks on private and public data. The intermediate layers of these networks learn to recognize general features in the data, which we send back to you as a vector of floats.
Using these embeddings lets you integrate high-dimensional data into existing models with very little effort. Embeddings are also a form of transfer learning: because we train the embeddings on millions of data points, you can get very good results even if your own high-dimensional dataset is very small. Read more in our FAQ.
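To illustrate the small-dataset point, here is a sketch of a nearest-centroid classifier trained on just two labeled examples per class. The embedding vectors and labels are invented for this example; the idea is that when the embedding already separates the classes well, even a very simple model on top of it can be enough.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def fit_nearest_centroid(labeled_embeddings):
    """Map each label to the mean of its example embeddings."""
    return {label: centroid(vecs) for label, vecs in labeled_embeddings.items()}

def classify(centroids, embedding):
    """Return the label whose centroid is closest to the embedding."""
    return min(centroids, key=lambda label: squared_distance(embedding, centroids[label]))

# Made-up 3-dimensional embeddings; only two labeled examples per class.
labeled_embeddings = {
    "fake_profile": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "real_profile": [[0.1, 0.9, 0.2], [0.2, 0.8, 0.1]],
}

model = fit_nearest_centroid(labeled_embeddings)
prediction = classify(model, [0.85, 0.10, 0.05])
print(prediction)
```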