Right now, around 100 gigabytes of data is generated from user activity across our company's different applications and products. The product I am working on consumes all of that data, gives insights into user behavior, and also helps to …
Running PySpark with Cassandra using spark-cassandra-connector in Jupyter Notebook
We were running into several out-of-memory issues when performing operations on the big data stored in our Cassandra cluster, so we decided it would be better to use Spark to solve this problem. It became a tough and …
Run your first Spark program using PySpark and Jupyter notebook
I think almost everyone who works with Big Data will cross paths with Spark in one way or another. I knew that one day I would need to go on a date with Spark, but somehow I kept postponing it for …