Ryan Petersen - Collective Intelligence
Written on Wed, Dec 3rd 2008, 13:50 in CakeFest, CakePHP
Collective intelligence is a shared or group intelligence that emerges from the collaboration and competition of many individuals. Amazon's book recommendations are one example. Netflix uses collective intelligence to track its stock and supply changes and to allocate its resources more accurately.
Recommendations require something to track, such as sales or user preferences. A group of users is also required, and the larger the group the better. A research survey found that a random sample of fewer than 1500 people is the ideal and most efficient sample size.
There are many ways to estimate preferences. The simplest is Euclidean distance scoring. The Pearson correlation score also plots the points on two axes; this method is used by Amazon and Delicious.
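As a rough illustration of the two scores, here is a minimal Python sketch. It is not Ryan's code. It assumes each user's preferences are stored as a dictionary mapping item names to numeric ratings, and the function names are hypothetical. Turning the Euclidean distance into a similarity with 1 / (1 + distance) is one common convention, not something stated in the talk.

```python
from math import sqrt


def euclidean_similarity(prefs_a, prefs_b):
    """Similarity in (0, 1]; 1.0 means identical ratings on shared items."""
    shared = [item for item in prefs_a if item in prefs_b]
    if not shared:
        return 0.0  # assumption: no shared items means no similarity
    distance = sqrt(sum((prefs_a[i] - prefs_b[i]) ** 2 for i in shared))
    return 1.0 / (1.0 + distance)


def pearson_similarity(prefs_a, prefs_b):
    """Pearson correlation over shared items, in the range [-1, 1]."""
    shared = [item for item in prefs_a if item in prefs_b]
    n = len(shared)
    if n == 0:
        return 0.0
    xs = [prefs_a[i] for i in shared]
    ys = [prefs_b[i] for i in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mean_x) ** 2 for x in xs)
                  * sum((y - mean_y) ** 2 for y in ys))
    if spread == 0:
        return 0.0  # one user rated everything the same; correlation undefined
    return covariance / spread


# Made-up ratings: bob rates every item exactly one point higher than alice.
alice = {"book_a": 1.0, "book_b": 2.0, "book_c": 3.0}
bob = {"book_a": 2.0, "book_b": 3.0, "book_c": 4.0}
print(euclidean_similarity(alice, bob))
print(pearson_similarity(alice, bob))
```

With these made-up ratings the Pearson score treats the two users as perfectly correlated even though their raw ratings differ, while the Euclidean score penalizes the constant offset. That difference is one reason to prefer Pearson when users rate on different personal scales.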
Stochastic Optimization
Stochastic optimization is the theory behind optimizing things. It requires a cost function, which is the most important part of the optimization because it gathers the data set being used, e.g. the execution time of a request, time spent on data integrity, error handling and logging, client bandwidth, and data queries or filtering. When collecting metrics and calculating cost you can use a unit cost and parts per unit, which lets you control and manage the final output. Depending on what you are measuring, you need to weight each factor differently. Writing unit costing functions is specific to the task you want to achieve.
Random searching is an inefficient process, because it is difficult to determine whether you are moving toward or away from where you want to be. When optimizing against a cost function, you also need to be aware of local minima. Once you have reached a new minimum, do additional testing by expanding the range in either direction. Expanding the test range helps confirm that you have found the true low-cost point rather than a local low point.
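The ideas above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method from the talk: the metric names, weights, and toy cost values are invented, and the search is a plain hill climb over an integer setting followed by a wider scan around the minimum it finds.

```python
# Hypothetical weights: how much each measured factor counts toward the cost.
WEIGHTS = {"execution_time": 1.0, "bandwidth": 0.5, "queries": 2.0}


def weighted_cost(measurements, weights=WEIGHTS):
    """Combine per-factor measurements into a single cost figure."""
    return sum(weights[name] * value for name, value in measurements.items())


def hill_climb(cost, start, low, high):
    """Step to the cheaper neighbour until no neighbour improves the cost."""
    current = start
    while True:
        neighbours = [n for n in (current - 1, current + 1) if low <= n <= high]
        if not neighbours:
            return current
        best = min(neighbours, key=cost)
        if cost(best) >= cost(current):
            return current  # a minimum, but possibly only a local one
        current = best


def widen_search(cost, found, low, high, radius):
    """Re-test a wider range on both sides of a minimum that was found."""
    window = range(max(low, found - radius), min(high, found + radius) + 1)
    return min(window, key=cost)


# Toy cost table (made-up values) with a local minimum at index 2
# and the true minimum at index 8.
TOY_COSTS = [9, 7, 4, 6, 8, 7, 5, 3, 1, 4, 8]


def toy_cost(x):
    return TOY_COSTS[x]


local = hill_climb(toy_cost, start=0, low=0, high=10)
better = widen_search(toy_cost, local, low=0, high=10, radius=6)
print(local, toy_cost(local))    # 2 4
print(better, toy_cost(better))  # 8 1
```

Starting from 0, the hill climb stops at the local minimum because both neighbours cost more. Widening the tested range around that point is what uncovers the cheaper setting further along.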
Genetic algorithms and hierarchical grouping are other statistical methods to look into. Ryan is going to release both his Pearson correlation and Euclidean distance scoring code on The Bakery and SerenitySoft, so check back for some code.

Comments:
Good Reference for Collective Intelligence
Reply | Geoffrey Bonnycastle | posted on 14/12/08