Channel: David Mudrak's blog » php

Moodle development traffic 33/2010


Latest stable version 1.9.9+

There is just one commit into the stable branch from the last development week (from Tue Aug 17 to Mon Aug 23). Sam Marshall fixed a bug in the library that handles the display of side blocks. The bug caused column space to be reserved on the left or right side of the screen whenever an instance of a block existed there, even if that type of block was disabled at the given site (MDL-23871).

Future version Moodle 2.0 Preview 4

There are 85 commits into the main development branch from the last week. The branch is very near to the feature freeze point and testers are preparing for the second round of Moodle 2.0 QA testing, which will start together with the first release candidate.
Petr Škoda redesigned the concept of the internal constant CLI_SCRIPT. Until now, this constant held the result of autodetecting whether the script was run via the web or the command line interface. It was used in if-statements, for example, to produce HTML output for browsers and plain text output for the command line. That job is now handled properly by the output rendering mechanism, and CLI_SCRIPT has a different meaning. A developer uses the constant to explicitly declare that a script is supposed to be run via CLI only, by calling define('CLI_SCRIPT', true); before including the main config.php. The autodetection in setup.php then just makes sure that such a script really is called via CLI. Conversely, it is not possible to run a script via CLI if it does not declare CLI_SCRIPT explicitly. This step makes the logic of the CLI_SCRIPT constant similar to AJAX_SCRIPT and prevents accidental execution of CLI scripts via the web and of web scripts via CLI. It was therefore necessary to prepare a CLI version of the /admin/cron.php file: if you run cron via CLI, you must now execute the /admin/cli/cron.php script. See MDL-23824 for details.
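To illustrate the rule, here is a minimal sketch of such a consistency check. It is not the actual code from setup.php: the function name cli_declaration_error() and the error messages are made up for this example, and only define('CLI_SCRIPT', true) and the built-in php_sapi_name() come from Moodle and PHP respectively.

```php
<?php
// Hypothetical helper, NOT Moodle's real setup.php logic.
// $declaredCli: true if the script called define('CLI_SCRIPT', true)
//               before including config.php.
// $sapi:        the value returned by php_sapi_name(), e.g. 'cli'.
function cli_declaration_error(bool $declaredCli, string $sapi): ?string
{
    $runningInCli = ($sapi === 'cli');

    if ($declaredCli && !$runningInCli) {
        // A CLI-only script was requested through the web server.
        return 'This script is CLI only and cannot be run via the web.';
    }
    if (!$declaredCli && $runningInCli) {
        // A web script was started from the command line.
        return 'This script does not declare CLI_SCRIPT and cannot be run via CLI.';
    }
    return null; // Declaration and execution environment agree.
}

// Typical call site (assumption): the first argument mirrors the constant.
// $error = cli_declaration_error(defined('CLI_SCRIPT') && CLI_SCRIPT, php_sapi_name());
```

Both mismatches are rejected, which is the symmetry described above: a declared CLI script cannot be reached via the web, and an undeclared one cannot be started from the shell.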

Quotes of the week

| .
Jordan Tomkinson plays pong game with Penny Leach via Jabber chat room

. |
Penny Leach strikes the ball back to Jordan

“David is going to surely put this in his blog”
Penny Leach was right

Mr Moodle, let me introduce you to Mrs Statistics

Moodle 2.0 File API uses a kind of content-addressable storage to keep course and user files on the server disk. In short, every file is saved into the file pool under a filename calculated as the SHA1 hash of the file content. If a file is copied (for example when a teacher clones a course), there is no need (and actually no way) to duplicate the file physically stored on the disk; only a new record is created in a special table of files.
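The idea can be sketched with a tiny in-memory model. The class TinyFilePool and its methods are invented for this illustration and have nothing to do with the real File API classes; the only real function used is PHP's sha1().

```php
<?php
// Minimal in-memory sketch of content-addressable storage.
// Class and method names are hypothetical, not the Moodle File API.
class TinyFilePool
{
    /** @var array content hash => content (stands in for files on disk) */
    private $pool = [];
    /** @var array rows of the "files" table: name + content hash */
    private $records = [];

    /** Store content under its SHA1 hash and register a record for it. */
    public function addFile(string $name, string $content): string
    {
        $hash = sha1($content);
        if (!isset($this->pool[$hash])) {
            $this->pool[$hash] = $content; // written only once per content
        }
        $this->records[] = ['name' => $name, 'contenthash' => $hash];
        return $hash;
    }

    /** Copying a file only adds a record; the stored content is untouched. */
    public function copyFile(string $hash, string $newName): void
    {
        if (!isset($this->pool[$hash])) {
            throw new InvalidArgumentException('Unknown content hash.');
        }
        $this->records[] = ['name' => $newName, 'contenthash' => $hash];
    }

    public function physicalCount(): int
    {
        return count($this->pool);
    }

    public function recordCount(): int
    {
        return count($this->records);
    }
}

$demo = new TinyFilePool();
$hash = $demo->addFile('syllabus.txt', 'Week 1: introduction');
$demo->copyFile($hash, 'syllabus-copy.txt');
echo $demo->physicalCount(), ' physical file, ', $demo->recordCount(), " records\n";
// prints: 1 physical file, 2 records
```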
All physical files are stored in the so-called file pool, a directory in moodledata. But as almost every file system limits the number of files or sub-directories per directory, it is necessary to distribute the files into sub-directories within the file pool. And here it gets interesting. The initial implementation of the file pool used three levels of sub-directories to store each file. So if the SHA1 hash of the file content was e.g. 10a68843c08fe4446839961153812b94a1983c6b, the file would be stored in $CFG->dataroot/filedir/10/a6/88/10a68843c08fe4446839961153812b94a1983c6b
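The mapping from hash to path can be written down in a few lines. This is only a sketch: filepool_path() is a made-up name, /var/moodledata is an assumed value standing in for $CFG->dataroot, and the $levels parameter exists just to make the number of directory levels visible.

```php
<?php
// Hypothetical helper, not the File API implementation.
// Each directory level consumes two hex digits (8 bits) of the hash.
function filepool_path(string $dataroot, string $contenthash, int $levels = 3): string
{
    if (!preg_match('/^[0-9a-f]{40}$/', $contenthash)) {
        throw new InvalidArgumentException('Expected a 40-character lowercase hex SHA1 hash.');
    }
    if ($levels < 1 || $levels > 20) {
        throw new InvalidArgumentException('Levels must be between 1 and 20.');
    }

    $parts = [];
    for ($i = 0; $i < $levels; $i++) {
        $parts[] = substr($contenthash, $i * 2, 2);
    }

    return $dataroot . '/filedir/' . implode('/', $parts) . '/' . $contenthash;
}

// '/var/moodledata' is an assumed dataroot used only for this example.
echo filepool_path('/var/moodledata', '10a68843c08fe4446839961153812b94a1983c6b'), "\n";
// prints: /var/moodledata/filedir/10/a6/88/10a68843c08fe4446839961153812b94a1983c6b
```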
Ashley Holman from NetSpot realized that distributing files into three levels of sub-directories in the file pool is quite an overkill. Three levels of two hex digits each give 256 × 256 × 256, roughly 16.7 million, possible sub-directories. If we expect the SHA1 hash to have randomly distributed bit values, the chance that two given files land in the same sub-directory seems to be around 1 in 16 million, so on the majority of Moodle sites it is pretty unlikely that even two files share a directory. So we probably use four file descriptors (three directories plus one normal file) in the hard disk file system to keep a single file. That was considered a waste of OS resources and the issue was filed as MDL-23885.
Eloy Lafuente mentioned a system he had worked on in the past, which did not use a fixed number of directory levels but scaled itself internally as needed, so it would use just one directory level on small sites with several dozen files and more levels on bigger sites with zillions of files. “I can say it has worked perfectly, scaling from 0 to 10 millions of files (today), with all the directories being filled as the system grows along the last 8 years (without waste of directory descriptions),” Eloy shared his experience.
Tim Hunt from the Open University was asked to come up with a statistical analysis of the problem: “For N files (where N is an estimate of the most uploaded files any Moodle will ever have), come up with a directory structure for SHA1 hashes, so most folders contain ~1.000 files or sub-directories, and no sub-directory ever goes above 32.000 files (or at least the chance of that is vanishingly small).” Tim posted a nice report of the results and it was decided that there will be just two levels of 8 bits each to group the files in the file pool. It must be said that the structure of the file pool is an internal implementation detail of the File API layer and no-one is ever supposed to access it directly. Therefore it was not difficult to modify the code according to the results of Tim’s research.
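A rough feeling for the numbers can be had from a back-of-the-envelope calculation. The sketch below is not Tim's analysis: it only computes the average number of files per lowest-level directory, assuming uniformly distributed hashes, and says nothing about how likely a single directory is to exceed the 32.000 limit. The function names and the sample file counts are made up for illustration.

```php
<?php
// Average occupancy only, assuming SHA1 prefixes are uniformly distributed.
// Function names and sample sizes are illustrative assumptions.

/** Number of lowest-level directories for the given number of levels. */
function leaf_directory_count(int $levels, int $bitsPerLevel = 8): int
{
    return 2 ** ($levels * $bitsPerLevel);
}

/** Average number of files per lowest-level directory. */
function expected_files_per_leaf(int $totalFiles, int $levels, int $bitsPerLevel = 8): float
{
    return $totalFiles / leaf_directory_count($levels, $bitsPerLevel);
}

foreach ([100000, 10000000, 65536000] as $n) {
    printf(
        "%9d files: %10.4f per directory with 3 levels, %9.2f with 2 levels\n",
        $n,
        expected_files_per_leaf($n, 3),
        expected_files_per_leaf($n, 2)
    );
}
```

With two levels of 8 bits there are 65,536 lowest-level directories, so the average reaches 1,000 files per directory at about 65.5 million files, while with three levels the average stays below one file per directory until a site holds more than 16.7 million files.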
It is nice when software development is not just about the programming language.

Post scriptum

var_dump('php_sucks' == 0);

