DataStax Developer Blog

Get the latest developer news and updates! The DataStax Developer Blog is a great resource for keeping up to date.

May 14, 2019 • By: Scott Hendrickson

We recently announced the availability of the DataStax Distribution of Apache Cassandra on Microsoft Azure, aimed at helping developers take advantage of a simpler, more cost-effective way to build and scale applications. We wanted to draw on more than 5 years of experience with large-scale deployments on Azure and deliver an environment with best practices built in so you can focus on your business. With that in mind, let’s jump into what you need to know to build an app.

May 13, 2019 • By: Russell Spitzer

Apache Spark! Darling of the Big Data world and the easiest entry point into Machine Learning. It's fast, it's cool, and it's hip. So many great things, but should you be using it in your stack?

May 10, 2019 • By: Jeff Carpenter

Over the first few posts of this series, I’ve been sharing my experience building a Python implementation of the KillrVideo microservice tier. In the previous posts I covered why I started this project, building gRPC service stubs, advertising the endpoints in etcd, and setting up integration tests to exercise the service APIs.

May 08, 2019 • By: Jeff Carpenter

After 2+ years of hard work from the community, Apache Cassandra 4.0 is in the final stages of testing for official release. As the biggest gathering of Cassandra professionals since the last major release (3.0), DataStax Accelerate represents a great opportunity to get caught up on what’s been happening in the project - straight from the committers and experts implementing these changes and the power users who are pushing the limits of this highly scalable database even further.

Apr 30, 2019 • By: Brian Hess

This is the third blog post about dsbulk. The first two blog posts (here and here) covered some basic loading examples. This post will delve into some of the common options for loading, unloading, and counting.

Apr 22, 2019 • By: Cristina Veale, Eric Zietlow

As developers, a lack of knowledge is the main thing that holds us back. The phrase “the more you know” has never been truer. From learning new languages and technologies to better understanding the theory around them, we must expand our knowledge base in order to stay relevant in today's industry. There are many classes we can take and blogs we can read, but nothing comes close to real, hands-on experience working with new technologies. Given the overwhelming demand, bootcamps and hands-on instruction are in short supply.

Apr 17, 2019 • By: Adron Hall

If you’re interested in running data-intensive systems (think Apache Cassandra, DataStax Enterprise, Kafka, Spark, TensorFlow, Elasticsearch, Redis, etc.) in Kubernetes, this is a great talk. @Lenadroid covers what options are available in Kubernetes and how architectural features around pods, jobs, stateful sets, and replica sets work together to provide distributed systems capabilities. She also delves into custom resource definitions (CRDs), operators, and Helm charts, along with upcoming and peripheral capabilities that can help you host various complex distributed systems. I’ve included references below the video here, enjoy.

Apr 17, 2019 • By: Amanda Moran

So you want to experiment with Apache Cassandra and Apache Spark to do some Machine Learning, awesome! But there is one downside: you need to create a cluster or ask to borrow someone else's to be able to do your experimentation… but what if I told you there is a way to install everything you need on one node, even on your laptop (if you are using Linux or Mac!)? The steps outlined below will install:

  • Apache Cassandra
  • Apache Spark
  • Apache Cassandra - Apache Spark Connector
  • PySpark
  • Jupyter Notebooks
  • Cassandra Python Driver

Note: No set of install instructions will work in all cases, since each environment is different. Hopefully, this works for you (as it did for me!), but if not, use this as a guide. Also, feel free to reach out and add comments on what worked for you!

Apr 09, 2019 • By: Brian Hess

In the last blog post, we introduced the dsbulk command, some basic loading examples, and dove into some mappings.  In this blog post, we are going to look into some additional elements for data loading.

Apr 04, 2019 • By: Jeff Carpenter

As a software industry veteran I’ve s̶e̶e̶n̶ / e̶x̶p̶e̶r̶i̶e̶n̶c̶e̶d̶ / i̶n̶f̶l̶i̶c̶t̶e̶d̶ / been victimized by any number of inventive approaches to integrating and testing distributed systems, so the title of this post is a bit tongue-in-cheek. I’ve been sharing my experience building a Python implementation of the KillrVideo microservice tier. In the previous posts, I shared how I got started on this project and covered building gRPC service stubs and advertising the endpoints in etcd. This time, I’d like to elaborate on why I built this service scaffolding first before implementing any business logic.