
A model behavior to make batch inserts and updates to a MySQL database, as well as fetching results in the form of a hashmap


omair-inam/CakePHP_Big_Data

 
 


Big Data Behavior

An easy way to efficiently insert, update, and work with large amounts of data using CakePHP.

Background

My company uses CakePHP for most of our applications. However, we were running into efficiency issues when working with large amounts of data.

It’s not uncommon for us to insert (or update) hundreds of thousands of rows with a single process. Additionally, we needed an efficient way to work with those hundreds of thousands of pieces of data.

After some investigation, I narrowed our efficiency problem down to CakePHP sending data to the database one row at a time. This works well out of the box, but it slows things down considerably once large amounts of data come into play. I remedied this by creating a behavior that allows a model to hold a “bundle” of objects.

This bundle is stored in memory. Upon saving the bundle, all of the model objects are inserted into the database as a bulk insert, 100,000 items per insert by default.
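As a rough illustration of the bundling idea, the sketch below turns an in-memory list of rows into one multi-row MySQL `INSERT ... ON DUPLICATE KEY UPDATE` statement per chunk. It is not the plugin's code: the function name `buildBulkInsertStatements()`, its parameters, and the `frogs` table are hypothetical, and `addslashes()` only stands in for proper driver-level escaping.

```php
<?php
// Hypothetical helper, not part of the plugin: builds one multi-row
// INSERT ... ON DUPLICATE KEY UPDATE statement per chunk of rows.
function buildBulkInsertStatements($table, array $rows, $chunkSize = 100000)
{
    $statements = array();
    if (empty($rows)) {
        return $statements;
    }

    // Assumption: every row has the same columns as the first one.
    $columns = array_keys($rows[0]);
    $columnList = '`' . implode('`, `', $columns) . '`';

    // On a unique-key collision, overwrite each column with the incoming value.
    $updates = array();
    foreach ($columns as $column) {
        $updates[] = "`$column` = VALUES(`$column`)";
    }

    foreach (array_chunk($rows, $chunkSize) as $chunk) {
        $tuples = array();
        foreach ($chunk as $row) {
            $values = array();
            foreach ($columns as $column) {
                // addslashes() is a stand-in; real code should use the driver's escaping.
                $values[] = "'" . addslashes((string) $row[$column]) . "'";
            }
            $tuples[] = '(' . implode(', ', $values) . ')';
        }
        $statements[] = "INSERT INTO `$table` ($columnList) VALUES "
            . implode(', ', $tuples)
            . ' ON DUPLICATE KEY UPDATE ' . implode(', ', $updates);
    }

    return $statements;
}

$rows = array(
    array('name' => '0 Froggy', 'color' => 'green'),
    array('name' => '1 Froggy', 'color' => 'green'),
    array('name' => '2 Froggy', 'color' => 'green'),
);

// A chunk size of 2 splits three rows into two statements.
$statements = buildBulkInsertStatements('frogs', $rows, 2);
echo count($statements), "\n";
```

Chunking keeps each statement below whatever packet or memory limit the server imposes while still cutting the number of round trips from one per row to one per chunk.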

Additionally, this behavior allows CakePHP find results to be returned in the form of a hashed array. The user can specify a ‘key’, which will serve as the key of the returned associative array.

Requirements

  • CakePHP 1.3
  • PHP 5.2+
  • MySQL

Installation

  1. Download this: https://github.com/jarriett/cake_big_data/zipball/master
  2. Unzip the downloaded file
  3. Copy the resulting folder to app/plugins

Issues

  • If debugging is enabled, PHP notices are generated and logged. If this behavior is being used for very large amounts of data, the log files can grow very quickly due to the generated PHP notices.

Usage

Have your model use the behavior:


<?php
App::import('Model', 'BigDataModel');

class Frog extends BigDataModel
{
	var $actsAs = array('BigData');
}
?>

Now to insert 100,000 rows to the database in one database call, do the following:


	// Build 100,000 rows in memory, then write them with a single saveBundle() call.
	for ($i = 0; $i < 100000; $i++)
	{
		$frog = array();
		$frog['Frog']['color'] = 'green';
		$frog['Frog']['name'] = $i . " Froggy";
		// uniqid() differs on every call; md5(mktime()) repeats within the same second.
		$frog['Frog']['unique_name'] = md5(uniqid($i . '-', true));
		$frog['Frog']['species_id'] = 7;
		$this->Frog->addToBundle($frog);
	}
	$this->Frog->saveBundle();
	// Note: If a unique key exists on the database table, by default any rows matching the unique key will be updated.

To fetch a hashed result set from the database, call the fetchHashedResult() function as demonstrated below:

	$frog_hash = $this->Frog->fetchHashedResult(array(
		'conditions' => array('Frog.species_id' => 7),
		'fields' => array('Frog.name', 'Frog.color'),
		'key' => array('Frog.name', 'Frog.color')
	));

The previous function call returns an associative array in which each object’s key is its Frog.name value followed by its Frog.color value, the two fields listed under ‘key’. If you would like the key to be md5-hashed, add ‘useHash’ => true to the parameter array.
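To make the shape of a hashed result concrete, here is a minimal sketch of keying find-style rows by one or more fields. It is not the plugin's implementation: the helper name `hashRows()` is hypothetical, and joining the key fields with no separator is an assumption about how the key is built.

```php
<?php
// Hypothetical helper, not the plugin's implementation: re-keys
// CakePHP-style find results by the concatenated values of $keyFields.
function hashRows(array $rows, array $keyFields, $useHash = false)
{
    $hashed = array();
    foreach ($rows as $row) {
        $key = '';
        foreach ($keyFields as $field) {
            // 'Frog.name' maps to $row['Frog']['name'].
            list($model, $column) = explode('.', $field, 2);
            // Assumption: values are joined with no separator.
            $key .= $row[$model][$column];
        }
        if ($useHash) {
            $key = md5($key);
        }
        // Later rows with the same key overwrite earlier ones.
        $hashed[$key] = $row;
    }
    return $hashed;
}

$rows = array(
    array('Frog' => array('name' => '0 Froggy', 'color' => 'green')),
    array('Frog' => array('name' => '1 Froggy', 'color' => 'green')),
);

$frogHash = hashRows($rows, array('Frog.name', 'Frog.color'));
var_dump(isset($frogHash['0 Froggygreen'])); // bool(true)
```

Keying the array this way replaces a scan over every row with a single array lookup, which is where the benefit shows up on result sets with hundreds of thousands of rows.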

Authors

License

Licensed under the MIT License. Redistributions of files must retain the above copyright notice.
