Using Collections in Cassandra


Cassandra provides collection types for grouping and storing related data together in a single column. Use a collection only when the amount of data it will hold is small and bounded. If the data can grow without bound, use a table with a compound primary key and store the data in the clustering columns instead. The supported collection types are:

  • list
  • set
  • map
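
For the unbounded case described above, a minimal sketch of the alternative might look like the following. The table name user_emails and its columns are hypothetical, chosen only for illustration. Each email becomes its own row in the user's partition rather than an element of a collection.

```sql
-- Hypothetical table: one row per email instead of a set<text> column.
CREATE TABLE user_emails (
  user_id text,
  email text,
  PRIMARY KEY (user_id, email)  -- user_id: partition key, email: clustering column
);

INSERT INTO user_emails (user_id, email)
  VALUES ('123', 'ajay1@gmail.com');

-- Returns all emails for one user, ordered by the clustering column.
SELECT email FROM user_emails WHERE user_id = '123';
```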

Using the set type

Create a table with a set column:

CREATE TABLE users (
  id text PRIMARY KEY,
  name text,
  emails set<text>
);
                        

Insert data into the table:

INSERT INTO users (id, name, emails)
  VALUES('123', 'Ajay', {'ajay1@gmail.com', 'ajay2@gmail.com'});
                        

Select records from the table:

SELECT id, emails FROM users WHERE id = '123';
                        

It returns the following result:

 id  | emails
-----+------------------------------------------
 123 | {'ajay1@gmail.com', 'ajay2@gmail.com'}
                        

You can add more emails:

UPDATE users
  SET emails = emails + {'ajay3@gmail.com'} WHERE id = '123';
                        

You can delete an email:

UPDATE users
  SET emails = emails - {'ajay3@gmail.com'} WHERE id = '123';
                        

Using the list type

When the order of elements matters, use the list type. A list stores elements in the order in which they were added, and it can contain duplicate values. For example, add a new column place_visited to the users table:

ALTER TABLE users ADD place_visited list<text>;
                        

Add some data:

UPDATE users
  SET place_visited = [ 'agra', 'delhi' ] WHERE id = '123';
                        

You can append or prepend elements to the list:

UPDATE users
  SET place_visited = place_visited + [ 'mumbai' ] WHERE id = '123';

UPDATE users
  SET place_visited = [ 'kolkata' ] + place_visited WHERE id = '123';
                        

You can set the element at a particular position. The index is zero-based and must already exist in the list:

UPDATE users SET place_visited[2] = 'jaipur' WHERE id = '123';
                        

When you set an element at a particular position, Cassandra reads the entire list and then writes only the updated element. This results in greater latency than appending or prepending an element.

Delete the element at a particular position:

DELETE place_visited[3] FROM users WHERE id = '123';
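
As a further sketch, elements can also be removed by value rather than by position. The statement below removes every occurrence of 'agra' from the list. Like the positional operations, removal by value involves an internal read before the write.

```sql
-- Removes all elements equal to 'agra' from the list.
UPDATE users
  SET place_visited = place_visited - [ 'agra' ] WHERE id = '123';
```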
                        

Using the Map Type

A map stores data as key-value pairs. Keys are unique, and each value is addressed by its key. Each element can have its own time-to-live (TTL) and expires when the TTL ends. Internally, each element of the map is stored as one Cassandra column. Add a new column todo to the users table:

ALTER TABLE users ADD todo map<timestamp, text>;
                        

Add some values to the user's todo map:

UPDATE users
  SET todo =
  { '2016-1-2 17:00' : 'Buy icecream',
  '2016-1-2 12:00' : 'Call seller' }
  WHERE id = '123';
                        

Update an element in the user's todo map:

UPDATE users SET todo['2016-1-2 12:00'] = 'call seller2'
  WHERE id = '123';
                        

Delete an element from the map:

DELETE todo['2016-1-2 12:00'] FROM users WHERE id = '123';
                        

View the map using the SELECT command:

SELECT id, todo FROM users WHERE id = '123';
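
The per-element TTL mentioned earlier can be sketched as follows. The TTL value of 86400 seconds (one day) and the todo entry are made up for illustration. Only the element written by this statement receives the TTL, and the other elements of the map are unaffected.

```sql
-- This map element expires 86400 seconds (one day) after the write.
UPDATE users USING TTL 86400
  SET todo['2016-1-3 09:00'] = 'Pay bill'
  WHERE id = '123';
```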
                        

Limitations

  • The maximum number of keys for a map collection is 65,535.
  • The maximum size of an item in a set collection is 65,535 bytes.
  • Cassandra can query up to 2 billion items in a collection, so do not insert more than that many elements.
  • The maximum size of an item in a list or a map collection is 2GB.
  • Cassandra reads a collection in its entirety, so keep collections small to prevent delays during querying. The collection is not paged internally.