Cassandra 1.2 Documentation Study Notes (18): CQL Data Model (Part 2)

Date: 2014-02-06 17:22:48

3. Collection columns

CQL 3 introduces the following collection types:

- set

- list

- map

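  In a relational database, to allow users to have multiple email addresses, you would create an email_addresses table with a many-to-one relationship to a users table. CQL 3 handles the classic multiple-email-addresses use case, and other similar use cases, by letting you define a column as a collection. Using a set to solve the multiple-email-addresses problem is convenient and intuitive.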
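As a rough sketch of the email case described above, a set column can hold the addresses directly in the user's row. The table name, column names, and sample values here (users, user_id, emails, and so on) are illustrative assumptions, not taken from the text.

```cql
-- Hypothetical users table: emails is a set column, so no separate
-- email_addresses table (and no join) is needed.
CREATE TABLE users (
  user_id text PRIMARY KEY,
  first_name text,
  last_name text,
  emails set<text>
);

-- Store several addresses in a single row (sample data).
INSERT INTO users (user_id, first_name, last_name, emails)
  VALUES ('frodo', 'Frodo', 'Baggins', {'f@baggins.com', 'baggins@gmail.com'});
```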

  Another use of collection types can be demonstrated with the music service example.

 

4. Adding a collection to a table

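  The music service example mentioned above includes the ability to tag songs. From a relational standpoint, you can think of storage engine rows as partitions within which rows are clustered. To tag songs, use the set collection type. A collection is declared with a CREATE TABLE or ALTER TABLE statement. Because the songs table already exists from the earlier example, simply alter the table to add a set named tags: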

 

ALTER TABLE songs ADD tags set<text>;

 

5. Updating a collection

Update the songs table to insert the tags data:

 

UPDATE songs SET tags = tags + {'2007'}
  WHERE id = 8a172618-b121-4136-bb10-f665cfc469eb;

UPDATE songs SET tags = tags + {'covers'}
  WHERE id = 8a172618-b121-4136-bb10-f665cfc469eb;

UPDATE songs SET tags = tags + {'1973'}
  WHERE id = a3e64f8f-bd44-4f28-b8d9-6938726e34d4;

UPDATE songs SET tags = tags + {'blues'}
  WHERE id = a3e64f8f-bd44-4f28-b8d9-6938726e34d4;

UPDATE songs SET tags = tags + {'rock'}
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;

 

A list of music reviews and a schedule of performances (a map collection) can also be added to the table:

 

ALTER TABLE songs ADD reviews list<text>;

ALTER TABLE songs ADD venue map<timestamp, text>;

 

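Each element of a set, list, or map is stored internally as one Cassandra column. To update a set, use the UPDATE command with the addition (+) operator to add an element or the subtraction (-) operator to remove one. For example, to update a set: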

 

UPDATE songs
  SET tags = tags + {'rock'}
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;
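The subtraction operator mentioned above works the same way. A minimal sketch, reusing the songs table and the row id from the preceding statement:

```cql
-- Remove a single element from the tags set; the other tags are untouched.
UPDATE songs
  SET tags = tags - {'rock'}
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;
```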

 

To update a list, use square brackets instead of curly braces; the syntax is otherwise similar.

 

UPDATE songs
  SET reviews = reviews + [ 'hot dance music' ]
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;
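Because a list is ordered, the side of the + operator on which the new element appears matters. The sketch below shows prepending and removal on the same reviews column; the review text 'best cover yet' is made-up sample data.

```cql
-- Prepend: the new element goes on the left-hand side of the + operator.
UPDATE songs
  SET reviews = [ 'best cover yet' ] + reviews
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;

-- Remove every occurrence of the given value from the list.
UPDATE songs
  SET reviews = reviews - [ 'hot dance music' ]
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;
```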

 

To update a map, use INSERT:

 

INSERT INTO songs (id, venue)
  VALUES (7db1a490-5878-11e2-bcfd-0800200c9a66,
  { '2013-9-22 12:01'  : 'The Fillmore',
  '2013-10-1 18:00' : 'The Apple Barrel'});

Inserting data into a map this way replaces the entire map.
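When only one entry should change, CQL 3 also lets you address a single map key so that the rest of the map is left alone. This is a sketch under the assumption that the venue map already holds the two entries from the INSERT above; the venue name 'The Fillmore West' is invented for illustration.

```cql
-- Set (or overwrite) one entry by key; other entries in venue are kept.
UPDATE songs
  SET venue['2013-9-22 12:01'] = 'The Fillmore West'
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;

-- Delete one entry by key.
DELETE venue['2013-10-1 18:00'] FROM songs
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;
```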

 

6. Querying a collection

  To query a collection, include the name of the collection column in the select expression.

 

SELECT id, tags FROM songs;

SELECT id, venue FROM songs;

 

When to use a collection

Use collections when you want to store or denormalize a small amount of data. Values of items in collections are limited to 64K. Other limitations also apply. Collections work well for storing data such as the phone numbers of a user and labels applied to an email. If the data you need to store has unbounded growth potential, such as all the messages sent by a user or events registered by a sensor, do not use collections. Instead, use a table having a compound primary key and store data in the clustering columns.
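For the unbounded case, the compound-primary-key alternative might look like the sketch below. The sensor_events table, its columns, and the sample values are hypothetical; the point is only that each event becomes its own clustered row instead of an element in a collection.

```cql
-- Hypothetical table: one partition per sensor, one clustered row per event.
CREATE TABLE sensor_events (
  sensor_id uuid,
  event_time timestamp,
  reading text,
  PRIMARY KEY (sensor_id, event_time)
);

INSERT INTO sensor_events (sensor_id, event_time, reading)
  VALUES (62c36092-82a1-3a00-93d1-46196ee77204, '2014-02-06 17:00', '21.5C');

-- Read a slice of one sensor's events using the clustering column.
SELECT event_time, reading FROM sensor_events
  WHERE sensor_id = 62c36092-82a1-3a00-93d1-46196ee77204
    AND event_time >= '2014-02-01 00:00';
```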

 

Expiring columns

Data in a column can have an optional expiration date called TTL (time to live). Whenever a column is inserted, the client request can specify an optional TTL value, defined in seconds, for the data in the column. TTL columns are marked as having the data deleted (with a tombstone) after the requested amount of time has expired. After columns are marked with a tombstone, they are automatically removed during the normal compaction (defined by the gc_grace_seconds) and repair processes. For information about gc_grace_seconds, see gc_grace in Keyspace and table storage configuration.

Use CQL to set the TTL for a column.

If you want to change the TTL of an expiring column, you have to re-insert the column with a new TTL. In Cassandra, the insertion of a column is actually an insertion or update operation, depending on whether or not a previous version of the column exists. This means that to update the TTL for a column with an unknown value, you have to read the column and then re-insert it with the new TTL value.
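A minimal sketch of setting, inspecting, and changing a TTL in CQL follows. The password_resets table and its data are assumptions made up for this illustration.

```cql
-- Hypothetical table used only for this example.
CREATE TABLE password_resets (
  user_name text PRIMARY KEY,
  token text
);

-- Insert with a TTL of 86400 seconds (one day).
INSERT INTO password_resets (user_name, token)
  VALUES ('jsmith', 'a1b2c3')
  USING TTL 86400;

-- Check the remaining time to live of the token column, in seconds.
SELECT TTL(token) FROM password_resets WHERE user_name = 'jsmith';

-- Changing the TTL means writing the value again with the new TTL.
UPDATE password_resets USING TTL 3600
  SET token = 'a1b2c3'
  WHERE user_name = 'jsmith';
```

Note that the final UPDATE has to supply the token value again, which is why an unknown value must be read first.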

TTL columns have a precision of one second, as calculated on the server. Therefore, a very small TTL probably does not make much sense. Moreover, the clocks on the servers should be synchronized; otherwise reduced precision could be observed because the expiration time is computed on the primary host that receives the initial insertion but is then interpreted by other hosts on the cluster.

An expiring column has an additional overhead of 8 bytes in memory and on disk (to record the TTL and expiration time) compared to standard columns.

 

Counter columns

A counter is a special kind of column used to store a number that incrementally counts the occurrences of a particular event or process. For example, you might use a counter column to count the number of times a page is viewed.

A counter column must be declared with the counter data type, and counters may only be stored in dedicated tables.

After a counter is defined, the client application then updates the counter column value by incrementing (or decrementing) it. A client update to a counter column passes the name of the counter and the increment (or decrement) value; no timestamp is required.
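The page-view case could be sketched as follows. The page_view_counts table and its column names are hypothetical; note that the only non-key column is the counter.

```cql
-- Hypothetical dedicated counter table.
CREATE TABLE page_view_counts (
  url_name text,
  page_name text,
  view_count counter,
  PRIMARY KEY (url_name, page_name)
);

-- Increment: only the counter name and the delta are supplied.
UPDATE page_view_counts
  SET view_count = view_count + 1
  WHERE url_name = 'www.example.com' AND page_name = 'home';

-- Decrement works the same way.
UPDATE page_view_counts
  SET view_count = view_count - 1
  WHERE url_name = 'www.example.com' AND page_name = 'home';
```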

Internally, the structure of a counter column is a bit more complex. Cassandra tracks the distributed state of the counter as well as a server-generated timestamp upon deletion of a counter column. For this reason, it is important that all nodes in your cluster have their clocks synchronized using a source such as network time protocol (NTP).

Unlike normal columns, a write to a counter requires a read in the background to ensure that distributed counter values remain consistent across replicas. Typically, you use a consistency level of ONE with counters because during a write operation, the implicit read does not impact write latency.

 

Using natural or surrogate primary keys

One consideration is whether to use surrogate or natural keys for a table. A surrogate key is a generated key (such as a UUID) that uniquely identifies a row, but has no relation to the actual data in the row.

For some tables, the data may contain values that are guaranteed to be unique and are not typically updated after a row is created, for example, the user name in a users table. Such a value is called a natural key. Natural keys make the data more readable and remove the need for additional indexes or denormalization. However, unless your client application ensures uniqueness, a write with an existing key could overwrite that row's column data.
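The two choices, and the overwrite risk, can be illustrated with a small sketch. Both tables and all sample values are hypothetical.

```cql
-- Natural key: the user name itself identifies the row.
CREATE TABLE users_by_name (
  user_name text PRIMARY KEY,
  email text
);

-- Surrogate key: a generated UUID identifies the row; user_name is ordinary data.
CREATE TABLE users_by_id (
  id uuid PRIMARY KEY,
  user_name text,
  email text
);

-- With the natural key, a second insert for the same user_name does not
-- fail; it overwrites the email written by the first insert.
INSERT INTO users_by_name (user_name, email) VALUES ('jsmith', 'john@example.com');
INSERT INTO users_by_name (user_name, email) VALUES ('jsmith', 'jane@example.com');
```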

Original article: http://www.cnblogs.com/dyf6372/p/3538775.html
