Rakam is an analytics platform that allows you to create your analytics services.
## Features / Goals
Rakam is a modular analytics platform that gives you a set of features to create your own analytics service.
A typical workflow with Rakam:

- Collect data from multiple sources with **[trackers, client libraries, webhooks, tasks etc.](🔗)**
- Enrich and sanitize your event data with **[event-mappers](🔗)**
- Store data in a data warehouse to analyze it later (PostgreSQL, HDFS, S3 etc.).
- Analyze your event data with your own SQL queries and the integrated analytics APIs (**[funnel, retention, real-time reports](🔗)**, **[event streams](🔗)**).
- Create custom dashboards and real-time reports using **[Rakam BI](🔗)**
- **[Develop your own modules](🔗)** for Rakam to customize it for your needs.
All these features come in a single package. You only need to specify which modules you want to use in a configuration file (`config.properties`) and Rakam will do the rest for you. We also provide cloud deployment tools for scaling your Rakam cluster easily.
If your event dataset fits on a single server, we recommend using the PostgreSQL backend. Rakam will collect all your events in a row-oriented format on a PostgreSQL node. All the features provided by Rakam are supported in the PostgreSQL deployment type.
However, Rakam is designed to scale to high workloads. You can configure Rakam to send events, serialized in Apache Avro format, to a distributed commit log such as Apache Kafka or Amazon Kinesis, process the data with PrestoDB workers, and store it in a distributed filesystem in a columnar format.
An event is an immutable action defined by your specific use case. Each event has a collection name and properties. If you have a website, every page view is an event: you may use `pageview` as the collection name, with the URL, client location data etc. as properties. If you are an IoT company, every sensor reading coming from your devices is an event. Events should have an actor and a timestamp of the occurrence.

Rakam provides multiple methods for collecting events. It has high-level methods such as trackers, client libraries and importers, as well as a [RESTful API](🔗), webhook support and schedulers for collecting events from multiple sources. You can embed trackers in your client applications, send events from your favorite programming language via client libraries or directly through the RESTful API, and integrate third-party services via webhooks or schedulers. We aim to make Rakam your data hub, so you should be able to collect data from everywhere into Rakam.
You don't need to define the collection schema; Rakam automatically handles schema evolution of the events for you. One caveat is that once a property has a type, Rakam tries to cast incoming values to that type. For example, if you have a property called _IP_ whose type is _integer_ and you send a _string_, Rakam tries to cast the _string_ to _integer_, and if the cast fails the field will be ignored.
Rakam checks the fields of each event. If they exist and the values match the existing schema, the event is sent to the storage backend as you sent it.
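The casting rule described above can be illustrated with a small sketch. This is not Rakam's implementation; the type names, the `CASTERS` table and the `apply_schema` helper are hypothetical and only model the behaviour: new fields extend the schema, existing fields are cast to their known type, and fields that fail to cast are dropped.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical type names and casters, used only for this illustration.
CASTERS: Dict[str, Callable[[Any], Any]] = {
    "integer": int,
    "double": float,
    "string": str,
}


def infer_type(value: Any) -> str:
    """Pick a type name for a field that is seen for the first time."""
    if isinstance(value, int) and not isinstance(value, bool):
        return "integer"
    if isinstance(value, float):
        return "double"
    return "string"


def apply_schema(
    schema: Dict[str, str], properties: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Return the properties that would be stored and the evolved schema."""
    stored: Dict[str, Any] = {}
    evolved = dict(schema)
    for name, value in properties.items():
        if name not in evolved:
            # Unknown field: the schema evolves to include it.
            evolved[name] = infer_type(value)
            stored[name] = value
            continue
        try:
            # Known field: try to cast the value to the existing type.
            stored[name] = CASTERS[evolved[name]](value)
        except (TypeError, ValueError):
            # The cast failed, so the field is ignored.
            continue
    return stored, evolved
```

With a schema of `{"ip": "integer"}`, sending `{"ip": "42"}` would store the integer `42`, while sending `{"ip": "not-a-number"}` would drop the `ip` field and keep the rest of the event.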
# RESTful API
If you want to send events directly from your applications, you can use the client libraries. They are essentially wrappers around the [RESTful API](🔗).
The client libraries are currently in beta and are generated automatically with Swagger. They provide classes and methods for all the endpoints of your Rakam API. You can also use the [RESTful API](🔗) directly to send data to Rakam.
The [RESTful API](🔗) powers the trackers and client libraries, and you can always use it directly. If the `_time` attribute is not set, Rakam automatically attaches the current timestamp to the event. If you know the user who performed the event, you should also set the `_user` attribute to the user id. These attributes are used by the funnel, retention and event explorer modules.
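As a rough illustration of the `_time` and `_user` attributes, the sketch below builds an event body in Python. The envelope keys `collection` and `properties`, the millisecond timestamp unit and the `build_event` helper are assumptions made for this example; check the API reference for the exact request format. Rakam fills in `_time` on the server side, so the default here only mirrors that rule to make it visible.

```python
import json
import time
from typing import Any, Dict, Optional


def build_event(
    collection: str,
    properties: Dict[str, Any],
    user: Optional[str] = None,
    now_ms: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a hypothetical event body with `_time` and `_user` attributes."""
    props = dict(properties)
    if "_time" not in props:
        # Mirror the documented default: use the current timestamp
        # (milliseconds are an assumption of this sketch).
        props["_time"] = now_ms if now_ms is not None else int(time.time() * 1000)
    if user is not None:
        # Set the actor so funnel and retention reports can use it.
        props["_user"] = user
    return {"collection": collection, "properties": props}


if __name__ == "__main__":
    event = build_event("pageview", {"url": "/pricing"}, user="user-123")
    print(json.dumps(event, indent=2))
```

An event that already carries its own `_time` is left untouched, which is useful when importing historical data.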
If you want to disable schema evolution for security reasons, you can set _disable_dynamic_schema=true_ in your configuration once you have created the schemas of your event collections. This is recommended if you run Rakam in production and the API is open to clients.
If you want to integrate third-party services with Rakam and the service supports webhooks, you can use our webhook support. For example, Stripe sends payment and order data, and Mailgun sends mail data (unsubscribe, click, open, etc.) via webhooks. Webhook support works as follows: you define an identifier such as `mailgun_mails` and write JS code that transforms the request body and headers into an event. When we receive a request at `[RAKAM_API]/webhook/collect/mailgun_mails`, we invoke the JS code, and if it returns JSON data, we add it as a new event.
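The flow above can be sketched as follows. In Rakam the transformation itself is written in JS; Python is used here only to illustrate the control flow. The `mailgun_transform` function, the `TRANSFORMERS` registry, the `handle_webhook` helper and the Mailgun field names (`event`, `recipient`) are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Event = Dict[str, Any]
Transformer = Callable[[Dict[str, Any], Dict[str, str]], Optional[Event]]


def mailgun_transform(body: Dict[str, Any], headers: Dict[str, str]) -> Optional[Event]:
    """Turn a (hypothetical) Mailgun request into an event, or return None."""
    if "event" not in body:
        return None  # Nothing usable in the request: no event is created.
    return {
        "collection": "mailgun_mails",
        "properties": {"type": body["event"], "recipient": body.get("recipient")},
    }


# One transformer per webhook identifier.
TRANSFORMERS: Dict[str, Transformer] = {"mailgun_mails": mailgun_transform}

WEBHOOK_PREFIX = "/webhook/collect/"


def handle_webhook(
    path: str, body: Dict[str, Any], headers: Dict[str, str], sink: List[Event]
) -> bool:
    """Invoke the transformer for the identifier in the path and store its result."""
    if not path.startswith(WEBHOOK_PREFIX):
        return False
    transform = TRANSFORMERS.get(path[len(WEBHOOK_PREFIX):])
    if transform is None:
        return False
    event = transform(body, headers)
    if event is None:
        return False
    sink.append(event)  # The returned data is added as a new event.
    return True
```

Keeping the transformer a pure function of body and headers makes it easy to test against recorded webhook requests before enabling it.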
Webhook support is available in Rakam BI and we provide templates for common services such as Mailgun and Stripe.
You can import CSV, JSON or Avro files directly into your Rakam project. If you already use an analytics service such as Mixpanel and want to move your data to Rakam, we have importers that fetch raw event data from the service and send it to Rakam. Currently we have a Mixpanel integration; you can find the documentation [here](🔗). We also have a [Twitter task](🔗) that collects tweet data from Twitter in real time and sends it to Rakam continuously; we use it internally for our integration tests.