DataStax Developer Blog

Get the latest developer news and updates! The DataStax Developer blog is a great resource to keep up to date on the latest!

Jun 07, 2019 • By: Brian Hess

In the previous three blog posts (here, here, and here), we covered some loading examples and some of the common options, such as logging and connection details. In this blog post, we turn our attention to unloading.

May 31, 2019 • By: Adron Hall

Some notes to go along with this talk, which is about ways to mitigate super nodes, partitioning strategies, and related efforts. Jonathan’s talk is vendor neutral, even though he works at DataStax. That’s not odd to me, since that’s how we roll at DataStax anyway. We take pride in working with DSE but also in knowing the various products out there; we’re all database nerds, after all. (more below video)

 

May 14, 2019 • By: Scott Hendrickson

We recently announced availability of DataStax Distribution of Apache Cassandra on Microsoft Azure aimed at helping developers take advantage of a simpler, more cost-effective way to build and scale applications. We wanted to take our experience with large scale deployments on Azure for over 5 years and deliver an environment with best practices built-in so you can focus on your business. With that in mind, let’s jump into what you need to know to build an app.

May 13, 2019 • By: Russell Spitzer

Apache Spark! Darling of the Big Data world and the easiest entry point into Machine Learning. It's fast, it's cool, and it's hip. So many great things, but should you be using it in your stack?

May 10, 2019 • By: Jeff Carpenter

Over the first few posts of this series, I’ve been sharing my experience building a Python implementation of the KillrVideo microservice tier. In the previous posts I explained why I started this project and covered building gRPC service stubs, advertising the endpoints in etcd, and setting up integration tests to exercise the service APIs.

May 08, 2019 • By: Jeff Carpenter

After 2+ years of hard work from the community, Apache Cassandra 4.0 is in the final stages of testing for official release. As the biggest gathering of Cassandra professionals since the last major release (3.0), DataStax Accelerate represents a great opportunity to get caught up on what’s been happening in the project - straight from the committers and experts implementing these changes and the power users who are pushing the limits of this highly scalable database even further.

Apr 30, 2019 • By: Brian Hess

This is the third blog post about dsbulk. The first two blog posts (here and here) covered some basic loading examples. This post will delve into some of the common options for loading, unloading, and counting.

Apr 22, 2019 • By: Cristina Veale, Eric Zietlow

As developers, a lack of knowledge is the main thing that holds us back. The phrase “the more you know” has never been truer. From learning new languages and technologies to better understanding the theory around them, we must expand our knowledge base in order to stay relevant in today's industry. There are many classes we can take and blogs we can read, but nothing comes close to real, hands-on experience working with new technologies. Given the overwhelming demand, bootcamps and hands-on instruction are in short supply.

Apr 17, 2019 • By: Adron Hall

If you’re interested in running data-intensive systems (think Apache Cassandra, DataStax Enterprise, Kafka, Spark, TensorFlow, Elasticsearch, Redis, etc.) in Kubernetes, this is a great talk. @Lenadroid covers what options are available in Kubernetes and how architectural features around pods, jobs, stateful sets, and replica sets work together to provide distributed systems capabilities. She also delves into custom resource definitions (CRDs), operators, and Helm charts, which include future and peripheral feature capabilities that can help you host various complex distributed systems. I’ve included references below the video; enjoy.

Apr 17, 2019 • By: Amanda Moran

So you want to experiment with Apache Cassandra and Apache Spark to do some machine learning? Awesome! But there is one downside: you need to create a cluster or ask to borrow someone else's to be able to do your experimentation. But what if I told you there is a way to install everything you need on one node, even on your laptop (if you are using Linux or Mac)? The steps outlined below will install:

  • Apache Cassandra
  • Apache Spark
  • Apache Cassandra - Apache Spark Connector
  • PySpark
  • Jupyter Notebooks
  • Cassandra Python Driver

Note: No set of install instructions works in all cases, because each environment is different. Hopefully this works for you (as it did for me!), but if not, use it as a guide. Also, feel free to reach out and add comments on what worked for you!
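
Since environments differ, it can help to check what actually landed on your machine after an install like the one listed above. The snippet below is a minimal sketch, not part of the original post: it assumes the usual executable names (`cassandra`, `spark-submit`, `jupyter`) are on your PATH and that the Python packages import as `pyspark` and `cassandra` (the import name of the Cassandra Python driver). The helper name `check_components` is hypothetical, and the Spark connector is not checked here because it is typically supplied as a JAR or package at Spark launch time.

```python
import importlib.util
import shutil

# Assumed names: adjust these if your install uses different ones.
EXECUTABLES = ["cassandra", "spark-submit", "jupyter"]
MODULES = ["pyspark", "cassandra"]  # "cassandra" is the Python driver's import name


def check_components(executables=EXECUTABLES, modules=MODULES):
    """Return a dict mapping each component to True if it was found."""
    found = {}
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        found[f"exe:{exe}"] = shutil.which(exe) is not None
    for mod in modules:
        # find_spec returns None when a top-level module cannot be located.
        found[f"module:{mod}"] = importlib.util.find_spec(mod) is not None
    return found


if __name__ == "__main__":
    for name, ok in check_components().items():
        print(f"{name:22s} {'found' if ok else 'MISSING'}")
```

Anything reported as missing is only a hint about where to look; a component may still be installed under a different name or outside your PATH.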