Should I always create my DynamoDB tables using hash and range primary key type?

The docs (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/apisummary.html) state:

You can query only tables whose primary key is of hash-and-range type

and

We recommend that you design your applications so that you can use the Query operation mostly, and use Scan only where appropriate

It's not directly stated, but does this make it best practice to always use hash-and-range primary keys?

Edit:

Answer TL;DR: use whichever primary key type makes sense for your data model, and use secondary indexes for better querying support.

References:

http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/gsi.html

http://www.allthingsdistributed.com/2013/12/dynamodb-global-secondary-indexes.html

https://forums.aws.amazon.com/thread.jspa?messageid=604862

In what situation do you use simple hash keys on DynamoDB?

The choice of which key to use comes down to your use cases and the data requirements of a particular scenario. For example, if you are storing user session data, it might not make much sense to use a range key, since each record can be referenced by a GUID and accessed directly, with no grouping requirements. In general terms, once you know the session ID you simply get the specific item by querying on that key. Another example is storing user account or profile data: each user has their own record, and you will most likely access it directly (by user ID or something else).

However, if you are storing order items, a range key makes much more sense, since you probably want to retrieve the items grouped by their order.

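In terms of the data model, the hash key (together with the range key, when one is defined) allows you to uniquely identify a record in your table, and the range key can optionally be used to group and sort several records that are usually retrieved together. Example: if you are defining an aggregate to store order items, the order ID could be your hash key and the OrderItemId the range key. Whenever you want to find the order items of a particular order, you query by the hash key (order ID) and get all of its order items.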
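
The grouping behaviour described above can be sketched with a small in-memory model. This is only an illustration, not the DynamoDB API: the class `CompositeKeyTable`, its method names, and the sample order data are all hypothetical.

```python
from collections import defaultdict


class CompositeKeyTable:
    """Toy in-memory stand-in for a hash-and-range table (not the DynamoDB API)."""

    def __init__(self):
        # hash key -> {range key -> item}
        self._partitions = defaultdict(dict)

    def put_item(self, hash_key, range_key, item):
        # The (hash key, range key) pair identifies exactly one item.
        self._partitions[hash_key][range_key] = item

    def get_item(self, hash_key, range_key):
        # Direct lookup needs both parts of the composite key.
        return self._partitions.get(hash_key, {}).get(range_key)

    def query(self, hash_key):
        """Return every item sharing the hash key, sorted by range key."""
        partition = self._partitions.get(hash_key, {})
        return [partition[key] for key in sorted(partition)]


# Hypothetical data: order ID as hash key, order item ID as range key.
orders = CompositeKeyTable()
orders.put_item("order-1", "item-002", {"sku": "pen", "qty": 3})
orders.put_item("order-1", "item-001", {"sku": "book", "qty": 1})
orders.put_item("order-2", "item-001", {"sku": "lamp", "qty": 1})

# One query by hash key returns all items of that order, in range-key order.
order_1_items = orders.query("order-1")
```

A session table, by contrast, would need only the outer dictionary: one key, one item, no grouping.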

Below is a formal definition of the use of these two keys:

"Composite hash key with range key allows the developer to create a primary key that is the composite of two attributes, a 'hash attribute' and a 'range attribute.' When querying against a composite key, the hash attribute needs to be uniquely matched but a range operation can be specified for the range attribute: e.g. all orders from Werner in the past 24 hours, or all games played by an individual player in the past 24 hours." [vogels]

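The "exact match on the hash attribute, range operation on the range attribute" rule from the quote can be illustrated as follows. This is a minimal sketch under assumed data: the dictionary `orders_by_customer`, the function `query_range`, and the timestamps are hypothetical, and the code does not call DynamoDB.

```python
from bisect import bisect_left, bisect_right

# Hypothetical data: customer ID (hash attribute) mapped to order timestamps
# in epoch seconds (range attribute), kept sorted in ascending order.
orders_by_customer = {
    "werner": [1_000, 50_000, 90_000, 100_000],
    "alice": [20_000, 95_000],
}


def query_range(table, hash_value, low, high):
    """Exact match on the hash attribute, inclusive range on the range attribute."""
    range_values = table.get(hash_value, [])
    start = bisect_left(range_values, low)   # first value >= low
    end = bisect_right(range_values, high)   # one past the last value <= high
    return range_values[start:end]


now = 100_000
one_day = 24 * 60 * 60

# "All orders from Werner in the past 24 hours"
recent = query_range(orders_by_customer, "werner", now - one_day, now)
```

Because the range values within one hash key are kept sorted, the range condition reduces to two binary searches rather than a scan of every record.
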
So the range key adds a grouping capability to the data model. However, the use of these two keys also has an implication for the storage model:

"Dynamo uses consistent hashing to partition its key space across its replicas and to ensure uniform load distribution. A uniform key distribution can help us achieve uniform load distribution assuming the access distribution of keys is not highly skewed." [ddb-sosp2007]

Not only does the hash key allow you to uniquely identify a record, it is also the mechanism that ensures load distribution. The range key (when used) helps to indicate which records will mostly be retrieved together, so the storage can also be optimized for that need.
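
To make the load-distribution point concrete, here is a minimal consistent-hashing sketch. It illustrates the general technique named in the quote, not DynamoDB's actual internals: the class `HashRing`, the node names, the use of MD5, and the virtual-node count are all assumptions chosen for the example.

```python
import hashlib
from bisect import bisect_right
from collections import Counter


def ring_position(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, virtual_nodes=100):
        points = []
        for node in nodes:
            # Several ring positions per node smooth out the distribution.
            for i in range(virtual_nodes):
                points.append((ring_position(f"{node}#{i}"), node))
        points.sort()
        self._positions = [position for position, _ in points]
        self._owners = [node for _, node in points]

    def node_for(self, hash_key: str) -> str:
        # The owner is the first ring point clockwise from the key's position,
        # wrapping around to the start of the ring if necessary.
        index = bisect_right(self._positions, ring_position(hash_key))
        return self._owners[index % len(self._owners)]


# Hypothetical nodes and session IDs used as hash keys.
ring = HashRing(["node-a", "node-b", "node-c"])
load = Counter(ring.node_for(f"session-{i}") for i in range(3000))
```

Printing `load` shows how many of the 3000 keys each node received. Keys that vary widely, such as session GUIDs, tend to spread across nodes, whereas a hash key with only a handful of distinct values would concentrate traffic on a few of them.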

Choosing the correct keys to represent your data is one of the most critical aspects of the design process, and it directly impacts how your application will perform and scale, and how much it will cost.

Footnotes:

The data model is the model through which we perceive and manipulate our data. It describes how we interact with the data in the database [fowler]. In other words, it is how you abstract your data model: the way you group your entities, the attributes you choose as primary keys, etc.
The storage model describes how the database stores and manipulates the data internally [fowler]. Although you cannot control it directly, you can optimize how the data is retrieved or written by knowing how the database works internally.
