Calculating Memory Requirements for NDB Storage Engine
When calculating storage requirements, NDB tables need extra consideration. For tables using the NDB Cluster storage engine, the factor of 4-byte alignment must be taken into account: all NDB data storage is done in multiples of 4 bytes.
For example, a column that takes 14 bytes to store requires 16 bytes in NDB; the extra 2 bytes are padding. Because of this alignment, TINYINT, SMALLINT, MEDIUMINT, and INT all require 4 bytes of storage per record in NDB. This rule does not apply to the BIT data type.
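The rounding rule above can be sketched as a one-line helper (a minimal illustration, not part of any NDB API):

```python
def align4(nbytes):
    """Round a column's storage requirement up to the next multiple of 4 bytes."""
    return (nbytes + 3) // 4 * 4

# A 14-byte column is padded to 16 bytes; a 1-byte TINYINT still occupies 4.
print(align4(14))  # 16
print(align4(1))   # 4
```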
BIT(X) – in the NDB storage engine this column takes X bits of storage space. If a table definition contains one or more BIT columns (up to 32 BIT columns), NDB Cluster reserves 4 bytes (32 bits) per row for them. If the table definition contains more than 32 BIT columns, NDB Cluster reserves 8 bytes per row. The NDB storage engine also reserves 4 bytes per row if the table definition contains any columns defined as NULL, up to 32 NULL columns.
In NDB versions earlier than 5.1, all columns in an NDB table were fixed in size and stored in memory. In 5.1, however, the NDB storage engine also supports variable-size data types in memory. Disk data columns are always stored as fixed-width columns. Each row is stored in one or two parts, depending on whether it has any variable-size columns. The fixed-size columns are stored together as the fixed-size part of the row; if the row has any variable-size columns, the fixed-size part of the row contains a pointer to where the variable-size part is stored. Each row with TEXT or BLOB columns is made up of two separate parts: the first 256 bytes of the value are stored in the table itself, and the remainder is stored in a hidden table. The rows in this second table are always 2,000 bytes long.
If a TEXT or BLOB value is 256 bytes or less, it is stored entirely in the table. If it is larger than 256 bytes, its storage requirement is calculated as 256 + X + (2000 − (X − 256) % 2000), where X is the size of the value in bytes.
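As a sketch of the formula above (the function name is our own; the figures are the 256-byte inline part and the 2,000-byte hidden-table rows described above):

```python
def blob_storage_bytes(x):
    """Estimate NDB storage for a TEXT/BLOB value of x bytes.

    Values up to 256 bytes fit entirely in the main table row part;
    longer values keep 256 bytes inline and spill the rest into a
    hidden table whose rows are always 2,000 bytes long.
    """
    if x <= 256:
        return x
    return 256 + x + (2000 - (x - 256) % 2000)

# A 3,000-byte value: 256 + 3000 + (2000 - 2744 % 2000) = 4512 bytes
print(blob_storage_bytes(3000))  # 4512
```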
Each row stored in an NDB table requires an overhead of 16 bytes per row, in addition to the overhead added by 4-byte alignment. For variable-size data types, 8 of these bytes are used as a pointer to the variable-size part of the row. Indexes also add to the per-row storage requirement: each ordered index uses 10 bytes of DataMemory per row, and each hash index uses 25 bytes of IndexMemory per row.
In-Memory Storage
From MySQL 5.1, the NDB storage engine can store data in memory and on disk. By default, MySQL Cluster sets all tables created with the NDB storage engine to in-memory data storage. Data in MySQL Cluster can be stored either on disk or in memory; indexes, however, are always stored in memory. Because indexes (and, by default, data) are kept in memory, data retrieval is very fast. To calculate memory usage, the first step is to work out the amount of RAM that each row uses in each table. Then multiply the size of each row by the number of rows to get the size of the database, and finally work out the memory usage on each storage node.
To calculate DataMemory, first work out how much each row uses (keeping in mind all the data type considerations discussed above), then add the fixed row overhead. Each ordered index takes 10 bytes of storage per row. To count the ordered indexes, add up all the indexes you have defined, and remember that ordered indexes are also created implicitly when primary and unique indexes are defined (unless the index is created with the USING HASH option). Each primary key or hash index (defined by the user or created automatically by NDB) occupies 25 bytes per row.
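The per-row calculation can be sketched as follows (a rough estimator under the figures given above; column widths are supplied by the caller and real NDB sizing has further subtleties):

```python
def per_row_memory(col_sizes, ordered_indexes=1, hash_indexes=1):
    """Rough per-row (DataMemory, IndexMemory) estimate for an NDB table.

    col_sizes: raw byte sizes of the columns (before 4-byte alignment).
    Applies 4-byte alignment per column, the 16-byte fixed row overhead,
    10 bytes of DataMemory per ordered index, and 25 bytes of
    IndexMemory per hash index.
    """
    align4 = lambda n: (n + 3) // 4 * 4
    data = sum(align4(s) for s in col_sizes) + 16  # columns + row overhead
    data += 10 * ordered_indexes                   # ordered indexes
    index = 25 * hash_indexes                      # primary key / unique hash indexes
    return data, index

# Example: an INT (4 bytes) plus a 14-byte column, primary key only:
# columns 4 + 16 = 20, +16 overhead, +10 ordered index = 46 bytes DataMemory
print(per_row_memory([4, 14]))  # (46, 25)
```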
When a unique key other than the primary key is created, an additional hidden table is created. Each such table has two columns: one holds the value declared unique by the user, and the other holds a value generated by NDB that serves as the primary key of the hidden table. Data in NDB is stored in pages; each page holds up to 32,768 bytes of data, and a page can only store data from one table (including the hidden tables associated with that table).
Number of rows per page = 32,768 ÷ size of each row (calculated above). This tells you how many rows fit in each page; each page also carries 128 bytes of overhead.
Total memory requirement in bytes = (size of each row × number of rows) + (number of pages × 128). This calculation must be done for all tables. RAM needed on each storage node = (total memory requirement for one copy of your tables × number of replicas) ÷ number of data nodes.
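The steps above can be combined into one sketch for a single table (the function and parameter names are ours; the 32,768-byte page size and 128-byte page overhead come from the text above):

```python
def cluster_ram_per_node(row_bytes, n_rows, n_replicas, n_data_nodes,
                         page_bytes=32768, page_overhead=128):
    """Estimate RAM per data node for one NDB table, per the steps above."""
    rows_per_page = page_bytes // row_bytes
    n_pages = -(-n_rows // rows_per_page)  # ceiling division
    total = row_bytes * n_rows + n_pages * page_overhead
    return total * n_replicas / n_data_nodes

# 1,000,000 rows of 46 bytes each, 2 replicas across 4 data nodes
print(cluster_ram_per_node(46, 1_000_000, 2, 4))  # 23089920.0 (~22 MB)
```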
This gives only an estimate of the RAM required for data storage; more RAM is needed for buffers, temporary storage, and so on.
Disk-Based Storage
From MySQL 5.1, a cluster can also use disk-based storage by allowing columns that are not indexed to be stored on disk. The disk data is stored in a tablespace, which is a collection of one or more data files on the data nodes.
Undo Logging Requires Space
With disk-based storage, UNDO log files and data files are created, which also use disk space. The UNDO logs are stored on the data nodes. A group of data files is called a tablespace, and a group of log files is called a log file group; the log file group must be created before the tablespace. The data files in a tablespace are split up into extents. An extent is a unit of allocation whose size is fixed at the time the tablespace is created. When a table requires space to store rows, it first searches for an extent it already owns that has free space. If none is found, a free extent in one of the tablespace's data files is allocated to the table. If no free extent is available, no extent can be allocated and no more rows can be added to the table. An extent can hold data from only one table.
Note: Views, triggers, stored routines, and user accounts cannot be stored in the cluster. Since these components are required on each MySQL server, it is necessary to copy them to all servers.