Simple PHP Caching With Mixed Static and Dynamic Content

One way to speed up your development time and website is to use cache in PHP. The method I use is made to respect Rapid CAAR Development. It doesn’t require any external libraries and will work on the vast majority of web hosts (even the really cheap ones).

Download PHP Caching

PHP cache can greatly reduce server load and let you get those few thousand extra page views out of your web host, by adding only a few lines of code to your source.

Using The PHP Cache

<?php
 
// these two files are in the store zip file
require("./includes/conf.inc.php");
require("./includes/debug.inc.php");
 
if(CACHE_ENABLE == 1){
  include("./includes/cache.inc.php");
}else{  
  function cache_start(){return True;};
  function cache_end(){};
}
 
?>
<html>
  <head>
    <title>Zoooom!</title>
   </head>
   <body>
<?php
 
echo "Hello: {$_SESSION['user_name']} <br>";
 
if(cache_start()){
  echo "Here are the results of a long running problem";
  solve_riemann_hypothesis_numerically();
}cache_end();
 
echo "Here is some not cached info {$_SESSION['user_name']}";
 
if(cache_start()){
 echo "But here is something else that is cached";
 poincare_conjecture(new Geometry('Narwhal Tooth'));
}cache_end();
?>
   </body>
</html>

For this to work you need to have a directory called cache/ which is in the same directory as the includes/ directory. The cache/ folder must be readable and writable by the user that Apache is running as.

The directory

% cd /path/to/yoursite.faux/
% whoami
apache 
% ls
-r-------- apache apache index.php
dr-x------ apache apache includes/
drwx------ apache apache cache/

It’s pretty straightforward to use really. Just include the file and make sure cache/ is readable and writable.

To clear the cache, visit http://yoursite.faux/?clear_cache=clear. You can set the “password” in the includes/conf.inc.php file; in this example the password is clear.

There are other configuration options in there as well that you can use if you need them.

Now all that’s left is to download the code. There are some other dirs in there as well, because this is taken directly from the svn repository I use for creating my foundation sites.

Download PHP Caching

Now on to the Code

The workhorses for the cache are PHP’s built-in output buffering ob_* functions, namely ob_start(), ob_get_contents() and ob_end_flush().

If you are not familiar with those functions you should check out the documentation after reading over this code.
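If you have never used output buffering, the following minimal sketch shows the pattern in isolation. It is not taken from the cache file; capture_output() is a hypothetical helper name used only for illustration.

```php
<?php
// Minimal sketch of the output buffering pattern (illustration only).
// capture_output() is a made-up helper, not part of cache.inc.php.
function capture_output(callable $render) {
    ob_start();                     // start collecting everything that is echoed
    $render();                      // run the code that prints its result
    $captured = ob_get_contents();  // copy the buffer without clearing it
    ob_end_flush();                 // send the buffer on to the browser and stop buffering
    return $captured;               // this string is what a cache would write to disk
}

$html = capture_output(function () {
    echo "expensive result";
});
```

The visitor still sees the output because of ob_end_flush(), while $html holds a copy that can be stored for the next request.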

Before we get into the main code, I think it will help to look at the relevant configuration file options.

DFHU Conf for Cache (includes/conf.inc.php)

 
<?php
// Cache Variables
define("CACHE_ENABLE",1);
// http://yoursite.faux/?clear_cache=clear will delete the cache 
define("CACHE_CLEAR_PASSWORD","clear");
// set this to 0 if your site doesn't use canonical urls (bad idea)
define("CACHE_CONICAL_URLS", 1);
// higher means call cache often, low means call cache rarely
define("CACHE_CALL_PROBABILITY",90);
define("CACHE_NAME_SALT","Put Random String Here");
?>
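To get a feel for what CACHE_CALL_PROBABILITY means in practice, here is a small sketch of the arithmetic the cache file performs with it (scale by 10, cap at 995, compare against rand(0,1000)). The function name cache_read_chance() is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical helper mirroring the arithmetic in cache.inc.php.
function cache_read_chance($call_probability) {
    // The percentage is scaled by 10 and capped at 995 ...
    $threshold = min(995, $call_probability * 10);
    // ... and compared against rand(0,1000), which has 1001 possible
    // values, of which exactly $threshold are below the threshold.
    return $threshold / 1001;
}

printf("%.3f\n", cache_read_chance(90));   // chance of reading from cache with the setting above
printf("%.3f\n", cache_read_chance(100));  // capped, so never quite 100%
```

Because of the cap, even a setting of 100 leaves a small chance that a request rebuilds the cached blocks, which is how the cache gets refreshed without any expiry time.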

And now onto the main caching functions for PHP.

DFHU PHP Cache Functions (includes/cache.inc.php)

 
<?php
/*
 
@author: Victory
@site: http://dfhu.org
@copyright: dfhu.org
@report_bugs: bugs(at)dfhu.org
@feature_request: features(at)dfhu.org
@file: dfhufoundation/includes/cache.inc.php
@license: BSD
 
@description:
 
  This is a function to accomplish caching. PHP5
 
  The cache dir should be set to ../cache/ and should be read and
  writable by Apache.
 
 
 
$Date:: 2009-08-05 12:01:59 #$:
$Rev:: 4                     $:
 
*/
 
// The cache number is incremented every time you call cache_start();
global $CACHE_NUMBER;
$CACHE_NUMBER=0; 
 
global $CACHE_COLLECTING;
$CACHE_COLLECTING=False;
 
// I know where you live cache dir, and realpath doesn't do what you
// want if the cache dir doesn't exist in PHP<5.3.
global $CACHE_DIR;
$CACHE_DIR=preg_replace(";/includes$;","/cache/",dirname(__FILE__));
 
 
// We need a variable to see if we should try to pull the cache or
// should we recalculate the contents of the cache. We call this
// variable $CACHE_CALL_THIS_TIME and we ...
global $CACHE_CALL_THIS_TIME;
// ... set it to true with the probability that a random number is less
// than the configured probability, expressed as a percentage with three
// sig figs.
$cache_call_probability = 
  min(995,CACHE_CALL_PROBABILITY*10);
if(rand(0,1000) < $cache_call_probability){
  $CACHE_CALL_THIS_TIME=True;
}else{
  $CACHE_CALL_THIS_TIME=False;
}
 
 
 
// Make sure the directory is writable
if(!is_dir($CACHE_DIR) or
   !is_writable($CACHE_DIR)){
 
  // This is my here() function it is defined in debug.inc.php. It
  // prints the message and a backtrace
  here("ERROR: cache dir won't let me write: $CACHE_DIR");
  exit;
}
 
// if we were sent something like
// http://mysite.faux/?clear_cache=thepassword_set_in_conf ...
if(isset($_GET['clear_cache']) and 
   $_GET['clear_cache'] == CACHE_CLEAR_PASSWORD){
  // ... then we delete ALL the '.cache' files in cache_dir
 
  // To do that we open up the dir (which we know is valid because we
  // did a is_dir/is_writable above) and ...
  $d = dir($CACHE_DIR);
  // ... iterate over all the files in the dir
  while (false !== ($entry = $d->read())) {
    // ... and if they are cache files ...
    if(!preg_match("/\.cache$/",$entry))
      continue;
    // ... then we can delete them (either using a loud and obnoxious
    // version or a soft and sneaky version)
    if(defined('DEBUG_LEVEL') and DEBUG_LEVEL > 2){
      echo "<br>deleting: " . $d->path . $entry . "<br>";
      unlink($d->path . $entry);
    }else{
      // The @ means i am sneaky i can fail and no one will ever know!
      @unlink($d->path . $entry);
    }
  }
  // Time to close up shop senior Dir.
  $d->close();
}// The end of delete all .cache files.
 
 
function get_cache_name(){
  // Create a filename for the cache files, using the REQUEST_URI, It
  // turns out to be something like return
  // md5($request.$salt).$extension;
 
  // We use the cache number to see which section of the page this
  // cache block is associated with. If you start adding random
  // cache_start blocks without clearing cache you will know why all
  // the WTF Bunnies are chewing you to bits.
  global $CACHE_NUMBER;
 
  // We create the extension for cache file.
  $cache_extension="-$CACHE_NUMBER.cache";
  // And shake out the salt.
  $salt=CACHE_NAME_SALT;
 
  // Now if you are enlightened you will use canonical urls and thus
  // will want to ...
  if(CACHE_CONICAL_URLS == 1){
    // ... parse the REQUEST_URI so that you can ...
    $url_bits=parse_url($_SERVER['REQUEST_URI']);
    // ... consider just the path when ...
    $path=$url_bits['path'];
    // ... creating and returning the cache name.
    return md5($path.$salt) . $cache_extension;
  }
 
  // Apparently you have some good reason for not having canonical
  // urls. I am guessing you are trying some SEO reverse psychology on
  // Google, let me know how that works out for you. /snideRemark
  return md5($_SERVER['REQUEST_URI'].$salt) . $cache_extension;  
}
 
function cache_start(){
  // This function, cache_start(), either starts output buffering or
  // outputs the cache by reading from the .cache file
 
 
  // This is set to true if we have started output buffering and false
  // otherwise.
  global $CACHE_COLLECTING;
  // The place on the page for this section of cache.
  global $CACHE_NUMBER;
  // This is the directory where the cache is stored.
  global $CACHE_DIR;
  // This is True if we should try to pull cache from the file or it
  // is false if we should re-populate the cache.
  global $CACHE_CALL_THIS_TIME;
 
 
  // Ok lets update the cache index for the page.
  $CACHE_NUMBER+=1;
 
  // construct the absolute path to the appropriate cache file.
  $cache_name=$CACHE_DIR . get_cache_name();
 
 
  // Should we try to pull the cache from the file this time, and is it
  // readable? If so then ...
  if($CACHE_CALL_THIS_TIME and
     is_readable($cache_name)){
    // ... we just dump the contents of the file to the screen. It's
    // very tempting to use include, but that would be a security
    // nightmare. file_get_contents isn't super mega ultra fast, so
    // maybe someone could load the file another way.
    echo file_get_contents($cache_name);
    return False;
  }
 
  /* NOTE: You could have done something like the following if you
           wanted to have the cache file update according to its age, but
           i am not really into that kind of craziness.
 
     $five_minutes_old=time() - 60*5;
     if($five_minutes_old > filemtime($cache_name) and
        is_readable($cache_name)){
 
     }
  */
 
  // Set the global cache to know that we should be collecting text
  // and be ready to cache it when we call cache_end();
  $CACHE_COLLECTING=True;  
  ob_start();
  return True;
}
 
function cache_end(){
  global $CACHE_COLLECTING;
 
  // If we were not collecting cache before, then just return.
  if(!$CACHE_COLLECTING){
    return;
  }
 
  // We are going to process the cache so turn off the cache
  // collecting flag.
  $CACHE_COLLECTING=False;
 
  // Now we need to get all the contents of buffer and store it in a
  // text file.
  cache_store(ob_get_contents());
 
  // finally we want to output the cache or else we will have some
  // confused surfers.
  ob_end_flush();
}
 
function cache_store($cache_contents){
  global $CACHE_DIR;
 
  // open up the cache file and store the contents of the output
  // buffer in it.
  $cache_name=$CACHE_DIR . get_cache_name();
  $fp=fopen($cache_name,'w');
  fwrite($fp,$cache_contents);
  fclose($fp);
}
 
?>
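The NOTE inside cache_start() hints at expiring cache files by age instead of by probability. A minimal sketch of that idea follows, assuming a hypothetical helper called cache_is_fresh() and a five minute default; neither is part of the download.

```php
<?php
// Hypothetical helper, not part of cache.inc.php: returns true when the
// cache file is readable and younger than $max_age seconds.
function cache_is_fresh($cache_name, $max_age = 300) {
    if (!is_readable($cache_name)) {
        return false;               // no usable cache file at all
    }
    $mtime = filemtime($cache_name);
    // filemtime() returns false on failure, so check before comparing.
    return $mtime !== false && (time() - $mtime) < $max_age;
}
```

With something like this in place, the read condition in cache_start() could test cache_is_fresh($cache_name) rather than is_readable($cache_name), so stale files are rebuilt on the next request.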

You should use cache whenever it will reduce the amount of time you need to think about optimizations for your code, when your code is slow, or when it will save you bandwidth (for instance, it’s perfect for sites that use ccenter).

Posted in intermediate.programming

How To Pick A Niche And Dominate: The DFHU Way Pt1

It was a toss-up what I should post first: either “how to structure your networks” or “how you can pick your niches and products to sell.” The two ideas are so intertwined in the DFHU way of thinking that it’s hard to separate them. I mean, the network is built the way it is because you pick the products you do, and you pick the products you do because they work well in building your SEO Empire.

Let me just say that there are many ways to pick niches. This method is congruent with the CAAR and DFHU mindsets, which is why I will push you to use it if you are going to get the most out of dfhu.ORG.

Each level of your Network has its own requirements for Monetization and Links

When I talk about the levels of your network, I mean that it can roughly be broken up into 5 vertically integrated levels.

Level 4 (Pollinators) – Sites you make, but don’t have complete control over (squidoo, free hosting, blogspot, etc…). These pages are mostly for link volume, getting other pages indexed and getting links from domains with authority/aged domains.

Level 3 (Farmers) – Sites you have control over, but which are just database sites, directory sites, article sites, auto [^b]logs or some other kind of set-it-and-forget-it site. These sites are mostly there to have pages indexed so that they can irrigate potential link juice.

Level 2 (Distillers) – These are sites where you gather up your off topic link juice (mostly from Farmers) and distill it into something that is on topic or on demographic. You give these sites some social love and occasionally put some link bait up. Humans generally write a decent proportion of these mini sites. The objective is to keep the GYB engines in the know on what is on topic, so they get the volume of links to rank but are built on topic. Now you have on topic sites with rank to send link/traffic juice up your network.

Level 1 (Merrymakers) – Now you have Pollinators throwing their capricious energy to best advantage, and the Farmers are irrigating link juice to the Distillers’ refineries. The Distillers are sending off the distilled link juice to the Merrymakers.

The Merrymakers are the first sites where you invest the kind of time needed to build a relationship. This is where you capture contact info (emails, names, mailing addresses, interests, demographics, etc…). This is the place to use WhereYouBeen. This is where you take the time to make contact with people on a regular basis (once a week, once every few days, maybe even once a day). This is where you get user created content, with your onsite SEO being Mauve Hat.

Level 0 (Disc0) – These are the sites that you are emotionally invested in. These are the sites that when people ask you “what you do” you say I run BrandedBySound.faux, JennetsJewelery.faux, or iLikeTurtles.faux. This is what all the other sites in your network are there to support. This is discØ baby.

Rules of Thumb:

• Each level of the network needs a form of monetization that complements the effort required to get the quality and quantity of traffic it needs.

• Each level of the network must pay its creation and upkeep fee. Generally speaking this must be done directly, but sometimes link juice is fair payment. As an example: if it’s $2/month for server resources, $1/month for the domain name and $80 to create the site, then over its first year the site has to make on average ($3*12+$80)/12 = $9.67/month to break even. A reasonable target for Pollinators, Farmers and Distillers is three times that break-even figure, so in this example if you were making about $29 per month per site you would be doing OK. You can do similar math for buying links instead of using your own content for Pollinators.

• Don’t let Merrymakers turn into time sinks. If you can’t get them to write content on their own, outsource people to create forum posts, blog comments and so on to stir up hype for these pages. It’s too easy to start spending all your time refreshing the page to see if you have something to reply to, just trying to keep the threads alive. IT’S A TRAP. There are plenty of people in the third world who really need work from cool people like you.

• If you can’t get excited at the disco then you are probably in the wrong gig. These are the sites you actually WANT to write and ENJOY getting email from prospects. This is the kind of site that gets your blood flowing. The Disc0 should be full of cool clients that either share a lot in common with you or you have a great deal of respect for them.

• The one rule of disc0 that’s never really worth breaking is that you must price yourself high enough that your profit margins let you pay the absolute MOST for customer acquisition of any of your competitors. If you were an adultIM you would be yawning at how obvious this is, but you’re not an adultIM so it probably comes from a bit out of left field and sounds scary. It’s pretty much self-evident if you give yourself a second to think about it (an elusive obvious).

Disc0 is where you want to be able to give the absolute best customer experience. These people are cool, you are cool, make this a business you can be proud of.

• The farther away from Disc0 you are in the network, the broader the market should be for the product or service you are pitching. For example, if Disc0 is “Silvered Spun Glass Earrings” then Farmers might put up links and content in the form of “Women’s Apparel,” “Romance Novels,” “Women’s Perfume.”

The Distillers would be more along the lines of “Glass Jewelery,” “Spun Glass Jewelery” or even “Silvered Spun Glass Jewelry.” The Merrymakers could have ad pages/sections for each of the keywords: “Silver jewelry,” “Silvered jewelry,” “Spun glass,” “Glass Jewelery,” “Spunglass,” “Silver Earrings,” etc…, with you siphoning off any of the ad space which is for products you sell at Disc0.
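The break-even rule of thumb above (upkeep plus creation cost, spread over the first year) can be sketched in a few lines of PHP. The function name break_even_per_month() and the 12 month default are assumptions made for this illustration.

```php
<?php
// Hypothetical helper: monthly income a site needs so that its upkeep
// plus its creation cost (spread over $months months) is covered.
function break_even_per_month($monthly_upkeep, $creation_cost, $months = 12) {
    return ($monthly_upkeep * $months + $creation_cost) / $months;
}

// $2 server + $1 domain per month, $80 to build the site.
$break_even = break_even_per_month(2 + 1, 80);
// The "three times" target for Pollinators, Farmers and Distillers.
$target = 3 * $break_even;

printf('break-even: $%.2f/month, target: $%.2f/month' . "\n", $break_even, $target);
```

Changing $months shows how quickly the target drops once the creation cost has been paid off, which is worth knowing before you decide whether a level of the network pays its way.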

That’s all well and good, but how do I pick a Niche?

This post is already pretty darn long so I am going to postpone writing the part that you actually wanted to read until another post …

…. Ok but just a little one now.

A pragmatic rule I adhere to strictly when picking disc0 is that the product I am selling has to have been sold offline before, and preferably since before the Internet was even around (prior to 1994).

A few quick reasons for this:

- One reason is that purely online products are generally only sold to net savvy (addicted) people. Net savvy people tend to zoom around the Internet in such a non-linear fashion at such a high speed that they are hard to capture (compare Google traffic vs Yahoo traffic). Also they tend to think they know everything because they can google anything.

- A better reason to sell things which have been sold offline, is research, history, and the back story. The majority of “premium” in premium priced products and services is the _story_ that it affords the purchaser. Products with history have story. I mean if you are selling “Online Day planner” software you can’t really tell the same kind of story you could for “handcrafted leather bound journals.”

Ethan Allen defiantly maneuvered before the imperial navy. He followed the route they knew he would take, when he said he would take it. It was, at last, the Cannon fire that embiggened his resolve to live …. so he wrote in his ZakyWaky DOT COM Online AJAX Enhanced Day E-Planner, “Today was cromulent, very-very cromulent indeed.” Fin.

I think the Leather Bound Day Planner is going to give you more story.

- Everything old is new again. You can create value out of thin air by attaching the lipstick of technology to your product’s bacony lips. For example, if you are selling Koi ponds you can have “no purchase necessary” contests for the best themed Koi pond, and each month/season you can have a different theme. People can upload their Koi pond pics to the site, with comments, ratings and so on. Add functionality so people can easily make pimped out slideshows of their fish, send them the DVDs of the videos, and so on…

- If your business has nothing offline you can get stuck in front of the computer forever. At least if you sell real products you can get some sunshine sometimes while you are making some money.

- Most people don’t want to deal with real crap because they think it’s bulky and a pain to move around. This means the competition is much thicker for online products. Really, there are so many things to sell which require little or no shipping (i.e. services) that it is really not an issue.

Don’t pick online products just because you are worried about having to do manual labor; other people can do the manual labor for you anyway. They are called employees. They are the worst part about business (well, other than taxes), but they can get the job done.

Enough for now, this post is too long as it is, so I will have to come back with a part 2. Subscribe to the RSS feed right this very second.

Posted in marketing

Harvest Comment Spammer’s Proxies

Yesterday we were hijacking comment links, today we are scoping the spammer’s proxies.

The post title pretty much says it all, but I’m in the library now and it’s raining pretty hard and I don’t feel like walking home, so let me flesh it out a bit.

Spammers worth half their salt will use proxies (often “open”) to get around IP blocking measures.

Knowing that, doesn’t it make sense to put the IPs of your comments through a proxy checking script? This is an old hat trick for anyone who plays on IRC.

When someone POSTs to your blog, you put their IP in a database. You then port scan those IPs for everyone’s favorite open proxy ports: 80, 8080, 3128, etc…

If any are open, you try to connect through it as a proxy. Hey, you already know that there is a 90% chance that if it’s open it supports POST.

You can give it a go, by using the wordpress comments on one of your open *logs.

Get Comment IPs From Wordpress DB

mysql> SELECT comment_author_IP FROM `wp_comments`;

So why not leave comments open so they can post their little rants, while you hijack their links and scope their proxies? Well, because the text sucks, but hey, you can clean that up too. You could leave them open, but then erase them with cron every night or two.

Comments are just one example; also record IPs for failed captchas, off-the-screen text areas, questionable forum posts, etc…

Why won’t this work for email spam, you say? Because most of those spammers use rented botnets, which don’t have open ports.

Posted in beginner-programming

Hijack Comment Links: A Wordpress Plugin

I joined Wicked Fire a few days ago, where I caught a post by Matt3 about hijacking links in your Wordpress blogs.

Basically, he said that some of the scripts he had been using were scraping comments, but leaving the comment links intact, which is generally not very helpful.

This plugin will take a list of links (one per line), search through comments for http:// style strings and replace them with your links.

It will also remove “nofollow” from links if you ask it to.
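As a rough stand-alone illustration of those two steps (URL replacement and nofollow removal), here is a sketch that runs without Wordpress. The function name replace_comment_urls() and the sample comment are made up for this example, and the nofollow handling is simplified to dropping the whole rel attribute, which is not exactly what the plugin does.

```php
<?php
// Hypothetical stand-alone version of the replacement step; the function
// name and the sample data are made up for illustration.
function replace_comment_urls($comment, $link, $remove_nofollow = false) {
    if ($remove_nofollow) {
        // Simplification: drop the whole rel="nofollow" attribute.
        $comment = str_replace(' rel="nofollow"', '', $comment);
    }
    // Match http(s) urls up to the next quote, angle bracket or whitespace.
    $pattern = ';https?://[^"\'<>\s]+;';
    // A callback avoids any special meaning of "$" or "\" in the link.
    return preg_replace_callback($pattern, function ($m) use ($link) {
        return $link;
    }, $comment);
}

$comment = 'Nice post! <a href="http://spam.faux/pills" rel="nofollow">click</a>';
$result = replace_comment_urls($comment, 'http://dfhu.org/', true);
echo $result, "\n";
```

Text outside the matched urls is left untouched, so the surrounding anchor markup keeps working with the new target.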

By default it works only on new posts, but it can work retroactively on old posts as well. If you use the retroactive mode then there is no guarantee that the links won’t change their targets when you update your list of URLs. Chances are that 75% of the time, you will not want to use the retroactive link injection feature.

Download Hijack Comment Links

Install it like any other Wordpress plugin, i.e. unzip it to ./wp-content/plugins/ and activate it from wp-admin.

No, I don’t use this script on this site, but I do ‘follow’ your comments, so why don’t you take a second and leave a comment.

Hijack Comment Links Code

For those who like to look over the code.

zzz-hijack-comment-links.php

 
<?php
/*
 
Plugin Name: Hijack Comment Links
Plugin URI: http://dfhu.org/scripts/
Description: Replace Comment Links With Links of Your Choice
Author: Victory
Version: 1.0 (really see source for right Version number)
Author URI: http://dfhu.org/
 
@author: Victory
@site: http://dfhu.org
@copyright: dfhu.org
@report_bugs: bugs(at)dfhu.org
@feature_request: features(at)dfhu.org
@file: zzz-hijack-comment-links/zzz-hijack-comment-links.php
@license: BSD
@version: 5
 
@description:
 
  This Wordpress Plugin, replaces the 'http://.*' style strings in
  comments with random urls of your choice.
 
$Date:: 2009-08-01 15:36:46 #$:
$Rev ::                      $:
 
*/
 
 
function hijack_comment_links_warning() {
  // Post a Little message to the wordpress admin to let people know
  // they have yet to update the list of links they wish to inject.
 
  echo "
<div id='dfhu-warning' class='updated fade'>
<p><strong>"
.__('Hijack Comment Links is Almost Ready.')
."</strong> "
.sprintf(__('You must 
<a href="%1$s">enter the links you want to inject</a> for it to work.'), 
	 "options-general.php?page=hijack-comment-links-group")
."</p></div>
";
}
 
function hijack_comment_links_menu() {
  // Just create the submenu for the options.
  add_submenu_page('options-general.php',
		   'Hijack Comment Links Options', 
		   'Hijack Comment Links Options', 
		   'manage_options', // capability name instead of the old numeric user level 8
		   'hijack-comment-links-group', 
		   'hijack_comment_links_options');   
}
 
// If the admin hasn't put any links in the yet ...
if(get_option('hjcl_links_to_inject') == ""){
  // ... then we suggest that she does so.
  add_action('admin_notices', 
	     'hijack_comment_links_warning');
}
 
// if we are admin ...
if ( is_admin() ){
  // then let us mess around with hjcl's settings
  add_action('admin_menu', 'hijack_comment_links_menu');
  add_action('admin_init', 'hijack_comment_links_settings' );
}
 
function hijack_comment_links_settings(){
  // register the links, nofollow and retro settings
  register_setting('hijack-comment-links-group', 
		   'hjcl_links_to_inject');
  register_setting('hijack-comment-links-group', 
		   'hjcl_remove_nofollow');
  register_setting('hijack-comment-links-group', 
		   'hjcl_retro');
}
 
 
function hijack_comment_links_options() { 
  // The Options Page for HJCL
  ?>
  <div class="wrap">
  <h2>Hijack Comment Links Options</h2>
  <form method="post" action="options.php">
    <input type="hidden" name="action" value="update" />
 
    <input 
  type="hidden"
  name="page_options" 
  value="hjcl_links_to_inject,hjcl_remove_nofollow,hjcl_retro"/>
 
    <?php wp_nonce_field('update-options');  
         settings_fields('hijack-comment-links-group' );
    ?>
 
    <table class="form-table">
 
      <tr>
	<th scope="row">
Enter <b>1</b> To Remove nofollow,
<b>0</b> To Keep nofollow.
	</th>
	<td>
	  <input 
	     type="text" 
	     name="hjcl_remove_nofollow" 
	     value="<?php 
echo intval(get_option('hjcl_remove_nofollow')); 
           ?>">
	</td>
      </tr>
 
      <tr>
	<th scope="row">
Enter <b>1</b> To Affect Links Retroactively (not recommended), 
<b>0</b> To just inject on new posts.
	</th>
	<td>
	  <input 
	     type="text" 
	     name="hjcl_retro" 
	     value="<?php 
echo intval(get_option('hjcl_retro')); 
           ?>">
	</td>
      </tr>
 
 
 
      <tr valign="top">
	<th scope="row">
	  <b>List of URLs to Inject. One Per Line.</b> 
	</th>
	<td>
	  <textarea 
	     cols="80"
	     rows="30"
             name="hjcl_links_to_inject"><?php
$hjcl_links=explode("\n",get_option('hjcl_links_to_inject'));
if(count($hjcl_links) < 3){
 echo "http://dfhu.org/
http://dfhu.org/blog/\n";
}
foreach($hjcl_links as $hjcl){  
  echo trim($hjcl) . "\n";
 
}
          ?></textarea>
	</td>
      </tr>
 
    </table>
 
    <p class="submit">
      <input type="submit" 
	     class="button-primary" 
	     value="<?php _e('Save Changes') ?>" />
    </p>
 
  </form>
</div>
 
<?php
}
 
 
function hijack_comment_links($comment){
  /* 
     Searches through the comment text for http(s) urls and replaces
     them with one of your preset urls
  */
 
  // really small comments don't have links, so ...
  if(strlen($comment) < 15){
    // we can just return the comment as is
    return $comment;
  }
 
 
  // We want to seed the random generator. We use the sum of the
  // Ordinal of the 2nd 12th,14th and 10th letters of the comment, why
  // those numbers?
 
  // we want to store the old seed, so that we can re-randomize the
  // PRG again.
  $o_seed=rand();
  srand(
   array_sum(
    array_map(
     'ord',
     array($comment[1],
	   $comment[11],
	   $comment[13],
	   $comment[10]))));
 
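  // The links are stored, one per line, so we explode over the
  // newline char, trim each line and drop the empty ones.
  $links=array_filter(array_map('trim',
    explode("\n",get_option('hjcl_links_to_inject'))));
 
  // Lets make sure we have links, but if we dont ...
  if(empty($links)){
    // ... then give the random number generator its seed back and
    // just return the comment as is.
    srand($o_seed);
    return $comment;
  }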
 
  // hey! we have links, long comments, lets get a random url
  $link=$links[array_rand($links)];
 
  // hey random number generator, here is your seed back.
  srand($o_seed); 
 
 
  // We can check to see if we are going to get all crazy and remove
  // nofollows, so our link juice sails.
  if(get_option('hjcl_remove_nofollow') == 1){
    // link juice be sailing, so nofollow is a no gone.
    $comment=preg_replace("/nofollow/","",$comment);
  }
 
  // the most flawless regex ever created by mankind to catch a
  // http(s) url, /sarcasm. It matches up to the next quote, angle
  // bracket or whitespace. If you don't like this regex you can
  // spruce it up to your liking. Or maybe you could do something
  // with filter_var(), or whatever makes you happy, for me, here is
  // the regex.
  $r=';https?://[^"\'<>\s]+;';
 
  // Now just replace all the urls in the comment with our url of
  // random choice. The closing quote of an href is not matched by
  // the regex, so the replacement is just the link itself.
  return preg_replace($r,$link,$comment);
}
 
 
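// Before the comment gets inserted we hijack the links, if any. We hook
// the filter that receives the comment text as a string (the
// 'preprocess_comment' filter passes the whole comment data array,
// which hijack_comment_links() cannot handle).
add_filter('pre_comment_content', 'hijack_comment_links', 100);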
 
 
// If you want the injection to work retroactively on old posts. This
// will result in links being randomized every time you edit your
// injected link list, it's not recommended.
 
/**/
if(get_option('hjcl_retro') == 1){
  add_filter('comment_text', 'hijack_comment_links', 100);
  add_filter('comment_text_rss', 'hijack_comment_links', 100);
  add_filter('comment_excerpt', 'hijack_comment_links', 100);
}
/**/
 
?>
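The plugin reseeds the random number generator from a few characters of the comment so that the same comment tends to get the same link. A different way to get a stable choice, shown here only as a sketch of a swapped-in technique and not as what the plugin does, is to index the list with a checksum of the comment. The function name pick_link_for_comment() is hypothetical.

```php
<?php
// Hypothetical alternative to reseeding rand(): derive the index from a
// checksum of the comment, so the same comment always maps to the same
// link as long as the list of links does not change.
function pick_link_for_comment($comment, array $links) {
    $links = array_values($links);   // make sure the keys are 0..n-1
    if (count($links) === 0) {
        return null;                 // nothing to inject
    }
    // abs() guards against negative checksums on 32-bit builds.
    $index = abs(crc32($comment)) % count($links);
    return $links[$index];
}

$links = array('http://dfhu.org/', 'http://dfhu.org/blog/');
$first  = pick_link_for_comment('Great article, thanks!', $links);
$second = pick_link_for_comment('Great article, thanks!', $links);
```

This leaves the global random number generator alone, so there is no seed to save and restore. As with the retroactive mode, editing the list of links can still change which link a given comment maps to.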

So that about does it for today. I want to make sure that I mix it up between coding, marketing and lifestyle posts.

I am still working on the title tweaker posts and am asking for feedback over at the SQUIRT forum.

Also, I will be getting on with redirects, seeing as people told me that they were useful. Sometimes the most basic stuff is the most powerful.

You can try and catch me on Skype using the username thrilling_victory if you have questions or comments or just want to network.

Posted in beginner-programming

Bing’s Big Off Topic Bonus

One of the hardest problems in Natural Language Processing (NLP) is Word Sense Disambiguation (WSD). WSD is the ability to calculate the meaning of a word from context.

When someone searches for a short keyword (without prior user data), WSD becomes darn near impossible for anything except the trivial case. The big G has for the most part been more focused on getting its users to be more specific. Microsoft/Bing, which loves to set the bar as low as possible for its users, just seems to want to give you a little taste of results from a few of the possible meanings, or to say “this is what you mean now, beeeyatch”. Frankly, I think that is a good idea for the average user; it’s basically like they are fleshing out some of the rationalization for the “suggest” function while you are typing in your query.

So let’s just Bing to your attention some examples.

The most in your face is probably

A Place in Korea [edit: this SERP has since been fixed]

But there are less egregious examples

A Language Parents Can Understand

Of course when they really nail it, they add in helpful sectioning (props to that, real competition in usage over G):

A Word with a Rainbow of Meanings.

Then again, if they really, REALLY think they got your number, then you are feeling lucky whether you like it or not!

Not A Swedish Hair Salon

Oh wait, maybe hair care is a new major or something.

It seems they really take into consideration your location for the WSD. The above Harvard/Hårvård joke probably won’t work in Europe.

The takeaway of this post is that, right now, there are cases where you can find keywords in your niche which are off-topic words in harder niches and get a little extra love from Bing. Consider this bonus when choosing your next tertiary keywords.

I think I am not alone in now focusing plenty of my SEO time on Bing: the traffic is good (convertalicious) and so far Bing seems to really like my foundation level sites.

Caveat: this post shows live search results. Search is a Complex Adaptive System (CAS), and exact results will change often and according to your IP.

Posted in seo-bing