A Software Engineer's Journal

A life spent making mistakes is not only more honourable, but more useful than a life spent doing nothing – George Bernard Shaw

PySpark

Running PySpark with Cassandra using spark-cassandra-connector in Jupyter Notebook

Posted on September 6, 2018 (updated April 4, 2023) by tankala

We were facing several out-of-memory issues when running operations on the large volumes of data stored in our Cassandra cluster, so we decided it was better to use Spark to solve this problem. It became a tough & … Continue Reading

Posted In Cassandra, Databases, PySpark, Spark
Tagged In Cassandra, Jupyter Notebook, PySpark, Spark

Run your first Spark program using PySpark and Jupyter notebook

Posted on September 2, 2018 (updated April 4, 2023) by tankala

I think almost everyone who works with Big Data will cross paths with Spark one way or another. I knew that one day I would need to go on a date with Spark, but somehow I kept postponing it for … Continue Reading

Posted In PySpark, Spark
Tagged In Jupyter Notebook, PySpark, Spark

© All rights reserved | Blog by Tankala.