Tools for Creating, Automating And Replicating (CAAR)

OK, as promised, here is a breakdown of the desktop tools I use daily.

The underlying theme is that these tools are great for Creating, Automating And Replicating (CAAR).

Ubuntu

My desktop OS of choice is Ubuntu. I use it because it is the best-supported FOSS operating system, it's the easiest OS to install, and it is very stable and no less powerful than FreeBSD, which was my OS for about 4-5 years.

I strongly feel that if you are going to work with Linux servers for hosting your websites, you should also be using a Linux distro as your desktop. You will be far more productive.

If you are running Windows, buy another hard drive and install Ubuntu on it. You will be creating and replicating in no time.

If you are still not willing to do that, install Cygwin.

At any rate, you need Apache, MySQL and PHP on your home box anyway, and Linux makes them trivial to install.

zsh

Zsh is my shell of choice. Oddly enough, it's both the most user-friendly and the most powerful of all the commonly available shells.

Its ability to cycle through tab completions is my favorite feature.

Emacs

Not only a text editor, it is also a religion. Emacs holds my day planner. I use it to read mail via Gnus. Basically, Emacs is always open when my computer is on.

I can use it to edit any kind of source code: php-mode, sql-mode, and all my custom PHP coding hacks. I just couldn’t live without Emacs.

It has TRAMP mode, which means I can edit files on other servers just as if they were on my box, which minimizes bandwidth usage and saves me from repeatedly uploading and downloading the file.

The syntax to open the remote file is trivial too: /victory@dfhu.org:~/www/blog/index.php. It is seamless.

I also know basic vim and use it for really quick edits, but it doesn't compare to the power of Emacs.

Subversion

My version control system of choice is SVN. It's quick and easy and keeps my source code well organized.

I am even starting to use svn propset svn:keywords 'Date Revision' now to keep the date and revision in the individual files. The boilerplate is inserted via an Emacs customization.

scp

I only get hosts which allow SSH access. This is almost all hosts, so it is not much of a restriction. I don't know why people are still using FTP(S) when life is so much easier with scp, a copy command that uses the SSH protocol.

I have all my SSH accounts set up to use public key authentication. This is more secure and doesn't require me to key in a password every time. You can do this manually from the command line, but Ubuntu comes with Seahorse, which can set up the keys in three easy steps (choose secure key, put in host, put in password).

rsync

rsync is king among men when it comes to replication. It uses ssh and some powerful magic I don't quite understand to remotely copy files to servers while only sending the data that is really needed. It handles all the permissions and such, and it has an easy syntax for what not to replicate.

It makes replicating a one-command deal, with no screwing around with FTP. Please use rsync. If you are not already, you are hurting. I will write a post on rsync later.
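Until that post exists, here is a rough sketch of what replicating to a handful of sites can look like, written in Python since it is on the tools list further down. The host names, paths and exclude patterns are made up for illustration; the rsync flags used (-a, -z, -e, --exclude, --dry-run) are standard ones. It only builds and prints the commands, so nothing is copied until you hand them to subprocess.run.

```python
import shlex


def rsync_command(src_dir, host, dest_dir, excludes=(), dry_run=True):
    """Build an rsync argv list that pushes src_dir to host:dest_dir over ssh."""
    cmd = ["rsync", "-az", "-e", "ssh"]
    if dry_run:
        cmd.append("--dry-run")  # report what would be sent, change nothing
    for pattern in excludes:
        cmd.append("--exclude=" + pattern)
    # A trailing slash on the source copies the directory's contents,
    # not the directory itself.
    cmd.append(src_dir.rstrip("/") + "/")
    cmd.append(f"{host}:{dest_dir}")
    return cmd


# Hypothetical hosts and paths, for illustration only.
hosts = ["victory@site-one.faux", "victory@site-two.faux"]
for host in hosts:
    cmd = rsync_command("www/blog", host, "~/www/blog",
                        excludes=["*.sqlite", ".svn/"])
    # To really run it: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

Leaving dry_run on until the printed commands look right is a cheap way to avoid overwriting the wrong directory.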

screen

GNU screen is a way to keep your shell scripts running on remote servers even when you are not logged in. So basically I can log in to any one of my servers, run `screen whatever-script` and then detach it.

When I log out, my script is still running. I can then reattach it later when I need it.

Please use screen if you are not already. It will save you plenty of time.

Xvfb

Xvfb is a tool to run a virtual X display. I use it when I need to run a program which requires a GUI window, but on a headless server.

It's not a huge tool, but I don't see that many people in IM talking about it, so I thought I would mention it.

PHP5.2

I try to keep up to date with my programming languages and OSes, but not be on the latest or experimental branch. I am shameless about not coding for PHP4; it's been dead for over a year. Who among you would code in PHP4 and bitch that your users are using IE6?

I would rather use Python (Pylons) to code my websites, but Pylons sites are not as easy to replicate and you need better webhosting.

PHP does have some really ugly bits. The two that annoy me the most are nasty-looking named parameters and, prior to PHP 5.3, insanely ugly lambda functions.

My database of Choice: SQLite3

I love SQLite. It's so much easier to replicate (just copy a file) than MySQL, which requires a daemon and setting up usernames and passwords and such.

You can have one SQLite file/db which you replicate globally (bad words, good domains, spider IPs, etc.) and then a local database for each install of your website, which changes on a local basis. You rsync all your global databases to all the sites in your campaign and you are good to go (a one-line command).

Similarly, you can download all your SQLite databases and merge them together into one global piece of uber-awesome data. It is trivial with scp/rsync.
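To make the merge step concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes every site carries a table with the same layout; the table name spider_ips, its single ip column and the file names are hypothetical. Each per-site file is ATTACHed in turn and its rows are copied into the global file, skipping duplicates.

```python
import os
import sqlite3
import tempfile


def merge_databases(global_path, site_paths, table="spider_ips"):
    """Copy every row of `table` from each per-site file into the global file."""
    con = sqlite3.connect(global_path)
    con.execute(f"CREATE TABLE IF NOT EXISTS {table} (ip TEXT UNIQUE)")
    for path in site_paths:
        con.execute("ATTACH DATABASE ? AS site", (path,))
        # INSERT OR IGNORE skips IPs the global table already has.
        con.execute(f"INSERT OR IGNORE INTO {table} SELECT ip FROM site.{table}")
        con.commit()  # DETACH is not allowed inside an open transaction
        con.execute("DETACH DATABASE site")
    rows = [r[0] for r in con.execute(f"SELECT ip FROM {table} ORDER BY ip")]
    con.close()
    return rows


# Build two throwaway per-site databases to merge (illustrative data only).
workdir = tempfile.mkdtemp()
site_paths = []
for name, ips in [("site_a.sqlite", ["10.0.0.1", "10.0.0.2"]),
                  ("site_b.sqlite", ["10.0.0.2", "10.0.0.3"])]:
    path = os.path.join(workdir, name)
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE spider_ips (ip TEXT UNIQUE)")
    con.executemany("INSERT INTO spider_ips VALUES (?)", [(ip,) for ip in ips])
    con.commit()
    con.close()
    site_paths.append(path)

merged = merge_databases(os.path.join(workdir, "global.sqlite"), site_paths)
print(merged)
```

In practice the per-site files would first be pulled down with scp or rsync; the merge itself never needs a database server.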

If you know MySQL it will only take a day or two to learn SQLite3. They are just different dialects of SQL, with many of the same features.

Note that in PHP the PDO sqlite driver is SQLite3; the sqlite_* functions are SQLite2, which I don't like.

Firefox uses SQLite.

My database by Necessity is MySQL

Most of the time MySQL is huge overkill and often doesn't actually introduce any appreciable performance enhancement. Most problems with databases come from sloppy database design, not the database software.

I learned MySQL before SQLite because it _was_ more widely supported. Now that PDO comes bundled as standard with PHP binary builds, SQLite3 is extremely common, and most webhosts are very cool about installing it within minutes if you ask.

Because everyone else uses it, I learned it, but I won't use it if I can use SQLite3.

Don't get me wrong, MySQL is great, but you can't replicate sites that use it with the ease that you can with SQLite3.

jQuery Javascript

Javascript is a wonderful language if you learn to discipline yourself enough to use good coding style.

jQuery: I bow down to the developers of jQuery, who are able to find, nurture and grow the beauty of JavaScript.

If you are coding JavaScript and not using jQuery, I don't really know what your argument is, other than not wanting to spend the five hours needed to learn jQuery.

The plugin system is great too.

Python

Python is such a beautiful and powerful language that it is always a pleasure to code in. If I need to really rip out some serious scraping or random website/CSS design, Python is the boy. NLTK holds the key to doing super powerful article spinning which gets the most human-readable content possible.

BeautifulSoup is super handy for scraping.

The downside of Python is that it doesn't run super fast, but I feel that my time is much more important than CPU time.

The Other Language-Type Things I Use

Tools that I use that come to mind right now; there are probably more.

- sed
- awk
- lynx -dump (i.e. the poor man's scraper)
- grep
- matlab/octave
- LaTeX
- eLisp
- C++
- find
- The Actual Bricks and Mortar Library

Summary

If you are using Linux servers, go out and buy another hard drive (screw messing around with dual boot) and install Ubuntu.

Now that you have Ubuntu, install Apache, PHP5, MySQL, Emacs, rsync, sqlite3, screen, zsh, scp and learn these tools.

It will take a little time, for sure, but once you've got it, your competition (who are still messing around with the alternatives) won't be able to touch you.

Are you in this game for only a year or two until you go back to pumping gas or sucking recycled air in a cubicle, or are you in this game for the long run? If you are in it for the long term, use the big boy tools.

Get to the library, get out the BOOKS (again, don't try to google-as-you-go learn, IT IS A TRAP).

I am willing to take suggestions on topics in the comments, but until then, I am currently working on a release of my PHP script to dynamically optimize titles over time, which I think you will really like. Think PolyPageTitle (but not just for WordPress) and Eli's blue hat technique number 19 (keyword spinning).

Posted in lifestyle-mindset

I Work

Hey, I wanted to step away from the specifics for a moment and talk to you about how I work.

I Turn Off the Internet

I only keep the Internet physically plugged in for a few hours a day and only when I absolutely need it. To go even further: I have not had an Internet connection at home for the last month or so.

I buy ebooks, download the official manuals or go to the library (where I am now) for reference books. I can’t stress enough the importance of actually reading entire books instead of searching for code snippets every twenty minutes when I can’t remember something.

I have played the Google-as-you-go learning style. IT IS A TRAP. Don’t do it; you will waste a huge amount of time. While I was Googling-as-I-went I was training my brain for short-term understanding of the subject. Now that I have to really understand what I am reading and not just cut and paste, my brain is TRAINED to pay attention and remember long term. This is huge. I am a way better coder now.

I am FORCED to automate to high heaven and to get code to work right the first time instead of a bunch of guess-and-check and Google.

I am Working or I am Not

When I am working I try to stay glued to my work. Work is what I am doing. I create a block of time and I get done what I am supposed to get done or I don’t. If the deadline passes, I might sudo rm -rf project. No looking back: get it done, or don’t.

I spend as much time on the important, but not urgent things, as possible. This ties back in with turning off the Internet. The Internet tricks you into working on urgent matters, whether they are important or not.

I don’t beat myself up. I just stay happy and moving forward. Without endorphins I wouldn’t get anything done.

When I am not working I am playing: I get some fresh air, hang out with people, exercise, listen to music, go for walks, and travel and move a lot.

I Screw Up

Sometimes I don’t work or play as hard as I should. Sometimes I don’t work up to my ideal. Sometimes, for reasons I can’t fully understand, I just don’t do what I should be doing and I waste time.

Sometimes I swear I actively sabotage my success.

Although it sucks to have to admit it, sometimes I find my own internal drama life-affirming.

If everything is going too smoothly, then I am just not striving for something worthy enough. Looking at my over-educated history, you can see this measurably in my report cards. I consistently got the best grades in the hardest or easiest classes; the stuff in the middle just wasn’t worth the mild effort.

I Secretly Want to Have A Partner or Two to Work With

Although I normally work very much alone*, I dream of having people to work with who work nearly as hard as me, but on complementary skills: people who are great at outsourcing, like writing, like doing the maintenance, and have a few stellar ideas they really badly want to implement but have not.

I can come off as a jerk and hard-headed to work with, for reasons that will become apparent in the next section.

* Over the last year or so I did work with someone as a partner. That person did not work well, did not have great ideas (but had really, really great products) and I lost A LOT of money just for the honor of learning from a lot of business mistakes I will never ever make again. I am still digging to get out of the hole this made.

My Ego is Unaffected by Anyone’s Unvalidated Opinion of My Work

Putting it another way: I am in whatever business my clients want me to be in and anything I create, write, or work for is for them and only them.

Obviously I don’t work in niches that don’t have interesting people or interesting products, so don’t take me the wrong way and think I would start selling lead weights to fish as long as the margin was big enough.

That being said, I don’t care if your neighbor’s dog’s cousin’s mother thinks the colors of your website are too plain or that there is too much text, or that the navigation says what it means instead of some stupid made up language you want me to use.

If you can’t show me MEASURABLY how your ideas will lead to either happier customers (who refer) or ones that buy more stuff, then frankly you can STFU.

It’s not my problem if you were picked last in gym class and now you have to show those kids how “fashionable” or “cool” or “chic” or “novel” your storefront is.

The only measure of success is if you are getting cool, happy customers who are putting money into your pocket.

I Respect Superiority

It doesn’t matter how great we think we are, just about any other person we meet will know how to do, at least, one thing considerably better than us.

When I meet someone who has more experience, more training, more ability to accomplish something useful, I STFU and do what they say, no questions asked.

Even if I think I could do a better job than the person I am working with/for, unless I actually have verifiable PROOF that there is a better way to do things, I just STFU. I make sure I am the right-hand man, getting done whatever I am tasked to get done.

Nobody cares how smart and knowledgeable I am; they just care about results.

Nobody Cares How Hard I Work

Nobody does or should care how hard I work. Not once in my entire life have I walked to the bank, handed over a check to cash and had the teller ask me “How hard did you work to get this money, because if it was really hard I might just add a couple extra zeros to it for you.”

Things I Need To Improve

- I don’t hold myself hard enough to the rule of “Always be working on the thing which has the most leverage.” In other words, there is almost always one thing I could be doing which would be making me the most money or saving me the most time, but I am not always doing it.

- I do not spend enough time networking with successful people.

- I do not spend enough time finding people who could be really successful if they had a little more guidance.

- I love coding and marketing so much that sometimes I get caught up in the thrill of building things to be “perfect” instead of just getting them out there and cashing the checks.

Overall I think I do a pretty darn good job and am happy with my work. My projects generally make money and my ideas generally turn out to be either correct or wrong in such a way that lessons can be learned from them.

This blog is going to live in conjunction with my biggest project yet: I am going to take on a huge, established, difficult niche which is full of suspicious and finicky prospects.

Seeing as this is going to be built from the ground up, I will be able to replay the drama here in abstraction.

Enjoy the ride, subscribe to the RSS and keep in touch.

In the next post I will suggest some FOSS applications I think everyone who is into automation and replication should use.

Posted in lifestyle-mindset

Let's Talk Affiliate Redirects, Part 1

It seems that many, many Internet Marketers are still not properly using redirect methods which protect against search engine penalties.

I will address redirects in a few ways, because there is no one-size-fits-all method of dealing with all cases.

The basic laws of linking to affiliates

1. All legitimate traffic MUST get to the product page.

2. Links shouldn’t ever be traversed by search engines (the search engine should not be able to tell what the landing page/site of your links is).

3. It’s almost always better to make it look like an affiliate link is really just MORE CONTENT on YOUR site which you don’t want crawled/indexed.

Following these three rules will allow you to go from looking like a spammy site to reading like an independent voice for the product you are trying to sell (provided you can get around duplicate content filters). It is more or less a one-eighty from the SEO perspective.

Using robots.txt

The robots.txt is a text file which you put in the root directory of your site and which tells crawlers which directories should NOT be crawled. The syntax is pretty trivial for our cases.

Example robots.txt

User-agent: *
Disallow: /info/
Disallow: /bikes/
Disallow: /af/
Disallow: /zaf/

Putting this in http://yoursite.faux/robots.txt will disallow the directories http://yoursite.faux/info/, http://yoursite.faux/bikes/, etc. from being crawled. If they do get crawled anyway, you have the right to slap a 403 Forbidden on the offending user agents.
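If you want to double check what a well-behaved crawler will make of those rules, Python's standard urllib.robotparser can read them. This is just a sanity check under the assumption that the crawler follows robots.txt the way this parser does; the bot name ExampleBot is made up.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /info/
Disallow: /bikes/
Disallow: /af/
Disallow: /zaf/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="ExampleBot"):
    """True if a robots.txt-respecting crawler may fetch the path."""
    return parser.can_fetch(agent, "http://yoursite.faux" + path)


for path in ["/index.php", "/info/foo.php", "/af/big_kite.php", "/reviews.php"]:
    print(path, "crawlable" if allowed(path) else "blocked")
```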

NOTE: these folders do not have to actually be on your server at all. We will use mod_rewrite to take care of pairing the links as shown on your site with the affiliate landing page.

mod_rewrite is standard on most Apache webservers and is a very powerful tool if you take the time to learn it. I will just show some very basic examples here.

Example .htaccess

<IfModule mod_rewrite.c>
 
RewriteEngine On
RewriteBase /
 
# Redirect http://example.com/info/foo.php to
# http://dfhu.org/landing.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^info/foo\.php$ http://dfhu.org/landing.php [R,L]
 
# Redirect any request for any file in /bikes/ to
# http://dfhu.org/bikes.php?myid=44
RewriteCond %{REQUEST_FILENAME} !-f 
RewriteRule ^bikes/ http://dfhu.org/bikes.php?myid=44 [R,L]
 
# Append the file name as the value of the prd url parameter.
# i.e. http://example.com/af/big_kite.php would be redirected to
# http://dfhu.org/p.php?id=1&prd=big_kite
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^af/(.*)\.php http://dfhu.org/p.php?id=1&prd=$1 [R,L]
 
# For any file requested in /zaf/ Query String Append that to the
# redirect url i.e.
# http://example.com/zaf/keyboard.php?cat=7&wtf=true would go to
# http://dfhu.org/product.php?affid=77&cat=7&wtf=true
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^zaf/ http://dfhu.org/product.php?affid=77 [QSA,R,L]
 
</IfModule>

Really, that is pretty much it. You can now link to the same page one hundred times from your site, using a different URL each time, and well-behaved search spiders will not know that you are laboring over just a few landing pages.
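A quick way to convince yourself what each rule does is to model it outside of Apache. The sketch below is a rough Python approximation of the four rules, not a replacement for testing on a real server: it ignores the RewriteCond file check and only imitates how the substitution, back-reference and QSA flag combine. The function name redirect_for is my own.

```python
import re

# (pattern, target, qsa) modeled on the four RewriteRule lines above.
RULES = [
    (r"^info/foo\.php$", "http://dfhu.org/landing.php", False),
    (r"^bikes/", "http://dfhu.org/bikes.php?myid=44", False),
    (r"^af/(.*)\.php", r"http://dfhu.org/p.php?id=1&prd=\1", False),
    (r"^zaf/", "http://dfhu.org/product.php?affid=77", True),
]


def redirect_for(path, query=""):
    """Return the redirect target for a request, or None if no rule matches."""
    # In .htaccess context the leading slash is stripped before matching.
    path = path.lstrip("/")
    for pattern, target, qsa in RULES:
        match = re.search(pattern, path)
        if not match:
            continue
        url = match.expand(target)  # fills in \1 the way $1 is filled in
        # The original query string survives only if the target has no
        # query string of its own, or if the rule carries the QSA flag.
        if query and (qsa or "?" not in url):
            url += ("&" if "?" in url else "?") + query
        return url
    return None


print(redirect_for("/af/big_kite.php"))
print(redirect_for("/zaf/keyboard.php", "cat=7&wtf=true"))
print(redirect_for("/about.php"))
```

The last call returns None because /about.php is ordinary content that no rule touches.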

An Example

Just to be clear, let me show an example using the most common case. Let's say our site is http://bignameindie.faux/

bignameindie.faux .htaccess

<IfModule mod_rewrite.c>
 
RewriteEngine On
RewriteBase /
 
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^rev/(.*)\.php http://indie.faux/p.php?id=1&prd=$1 [R,L]
 
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^dld/(.*)\.mp3 http://indie.faux/?id=1&prd=$1 [R,L]
 
</IfModule>

Then we have to make sure that directories dld and rev are not crawled by the search engines so we update our http://bignameindie.faux/robots.txt as follows.

bignameindie.faux/robots.txt

User-agent: *
Disallow: /dld/
Disallow: /rev/

Then a page on our bignameindie.faux site might look like:

bignameindie.faux/reviews.php

<html>
 <head>
  <title>Big Name Indie Reviews</title>
 </head>
 <body>
   <p>
<q>I Am Bones</q> doesn't have that many great songs,
but I would suggest giving <a 
 href="http://bignameindie.faux/dld/ostrich_approach.mp3" 
 title="Download The Ostrich Approach.mp3">The Ostrich Approach
</a> a chance. Really listen to it, it's great.
   </p>
   <p>
When were you guys going to tell me about <q>Fujiya + Miyagi</q>?
I can't get enough of <a 
 href="/rev/Knickerbocker.php"
 title="Review of Knickerbocker">Knickerbocker</a>! It is cheesy
but catchy. 
   </p>
   <p>
Ok, now for some really good Indie music without the need for a 
gimmick. Everyone will love <a 
 href="/rev/band-of-horses.php"
 title="Review of Band of Horses">Band of Horses</a>. A great place
to start with them is to listen to <a 
 href="/dld/the-general-specific.mp3"
 title="Listen to the General Specific">The General Specific</a>.
   </p>
 </body>
</html>

If you are not doing something like this already you are very likely to see improvements in indexing and ranking.

This isn’t going to be the last time I talk about affiliate redirects, because it’s a fundamental piece of SEO and Marketing which is often overlooked.

The spoiler is that you need to create your own tinyurl style site which is not to be indexed at all, which you then use as a hub to redirect all the affiliate links in your empire through, so you have an insane amount of control in pushing users to the affiliates who are converting best (or the ones that are giving out the coolest prizes/vacations for top sales). As a bonus you can get some pretty snazzy statistics.
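As a taste of that hub idea, here is a minimal sketch of the routing decision only. Everything in it is hypothetical: the slug, the affiliate URLs and the click and sale counts are invented for illustration, and a real hub would keep them in a database and issue the HTTP redirect itself.

```python
# Hypothetical stats per slug: each candidate affiliate URL with the
# clicks and sales seen so far (invented numbers).
OFFERS = {
    "big-kite": [
        {"url": "http://affiliate-a.faux/kite?aff=1", "clicks": 400, "sales": 8},
        {"url": "http://affiliate-b.faux/kite?aff=77", "clicks": 250, "sales": 10},
    ],
}


def conversion_rate(offer):
    """Sales per click; offers with no clicks yet count as zero."""
    return offer["sales"] / offer["clicks"] if offer["clicks"] else 0.0


def destination(slug):
    """Pick the best-converting affiliate URL for a slug, or None if unknown."""
    offers = OFFERS.get(slug)
    if not offers:
        return None
    best = max(offers, key=conversion_rate)
    best["clicks"] += 1  # record the click so the stats keep moving
    return best["url"]


print(destination("big-kite"))
```

Because every link on every site points at the hub, changing where a slug goes is one edit in one place.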

There is a bit more to it than that, because in the future you might want to make those links flow juice to some of your other projects, so you can update the redirects on site and take down the robots.txt restriction.

When you start really getting your kingdoms in order, traffic and link juice are just two fluids to be irrigated to parts of your empire with the most favorable growing conditions.

Posted in beginner-programming

Where You Been Web Analytics

What would you do if you had access to all your visitors’ browser history? You would know if they have already visited your affiliate program’s landing page, your competitors, or other sites in your campaign. You may be able to make an educated guess about their age, sex, level of affluence, ethnicity and interests. You could redirect, greet or sell differently to each visitor depending on where they have been.

Well, with this free (BSD License) whereyoubeen script you can do pretty much that. More specifically, you can get a YES or NO answer as to whether they visited a given page in a list of URLs you provide.

Where You Been

It is a collection of scripts that I have created to test a list of user-provided URLs to see if they are in the visitor’s browser history and then save them to a database.

JavaScript (jQuery) is used to check the rendered color of links in a hidden DOM object to see whether they have the a:visited color or just the normal a color.

A jQuery.ajax() call is made to POST the array of visited links to a PHP script which then dumps them into a SQLite database.
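For a feel of the server side, here is a rough Python sketch of that save step. It is not the PHP script that ships with the tool; it assumes the visited URLs arrive as a single string joined with '|||' (as the JavaScript later in this post sends them) and that rows go into the whereyoubeen table from the schema shown further down. The function name save_visited and the sample IP are illustrative.

```python
import sqlite3


def save_visited(db, payload, remote_addr, user_agent):
    """Split the '|||'-joined POST value and store one row per visited URL."""
    urls = [u for u in payload.split("|||") if u]
    db.executemany(
        "INSERT INTO whereyoubeen (remote_addr, user_agent, url) VALUES (?, ?, ?)",
        [(remote_addr, user_agent, url) for url in urls],
    )
    db.commit()
    return len(urls)


db = sqlite3.connect(":memory:")
db.execute("""
CREATE TABLE whereyoubeen (
 remote_addr TEXT,
 remote_host TEXT,
 user_agent TEXT,
 url TEXT,
 last_visit DATETIME DEFAULT CURRENT_TIMESTAMP
)""")

saved = save_visited(
    db,
    "http://www.ebay.com/|||http://www.craigslist.org/",
    "192.0.2.10",
    "Mozilla/5.0",
)
print(saved, db.execute("SELECT url FROM whereyoubeen").fetchall())
```

Using bound parameters for every value matters here, since the URLs and user agent come straight from the visitor's browser.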

A simple stats script is also provided, which shows the frequency of users who have visited each of the pages in your list, given that they have visited at least one of the pages in your list.
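The number it reports can be sketched in a few lines of SQL. The example below is an assumption about how such a frequency could be computed, not the code of wyb_stats.php: it uses a cut-down whereyoubeen table (just remote_addr and url) with made-up rows, counts the distinct visitors per URL, and divides by the number of visitors who hit at least one URL.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE whereyoubeen (remote_addr TEXT, url TEXT)")
# Made-up rows: four visitors, each with at least one hit.
db.executemany(
    "INSERT INTO whereyoubeen VALUES (?, ?)",
    [
        ("192.0.2.1", "http://www.ebay.com/"),
        ("192.0.2.1", "http://www.craigslist.org/"),
        ("192.0.2.2", "http://www.ebay.com/"),
        ("192.0.2.3", "http://www.bing.com/"),
        ("192.0.2.4", "http://www.ebay.com/"),
    ],
)


def url_frequencies(db):
    """Share of recorded visitors (those with at least one hit) per URL."""
    total = db.execute(
        "SELECT COUNT(DISTINCT remote_addr) FROM whereyoubeen"
    ).fetchone()[0]
    rows = db.execute("""
        SELECT url, COUNT(DISTINCT remote_addr) AS visitors
        FROM whereyoubeen
        GROUP BY url
        ORDER BY visitors DESC, url
    """)
    return [(url, visitors / total) for url, visitors in rows]


for url, share in url_frequencies(db):
    print(f"{share:.0%}  {url}")
```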

Download WhereYouBeen

You can download the whereyoubeen tool ready to go right here. Note that it requires PHP5.2 with PDO/SQLite support. PDO/SQLite comes standard with current binary builds of PHP. If you think your host isn’t running PHP5, try adding the following to your .htaccess file: AddHandler application/x-httpd-php5 .php.

Now let’s look at how we can invoke the script in a PHP file to be shown to surfers.

Calling Where You Been (index.php)

<html>
 <head>
  <title>My Webpage</title>
  <?php
   // if you already have jQuery included, you can set this to True
   $ALREADY_HAVE_JQUERY=False;
   // set this to the path where the script files are
   $WYB_PATH="/wyb";
   // this includes all the headers and javascript
   include($_SERVER['DOCUMENT_ROOT'] .
           "$WYB_PATH/wyb_header.inc.php"); 
  ?>
 
 </head>
 <body>
   <p>This is some content to show the user</p>
 </body>
</html>

That’s pretty much it. You can edit wyb-sites-to-check.txt to customize what EXACT URLs to check. The stats can be shown in wyb_stats.php.

Onto The Code

Now, for those that are interested, let’s look at some of the files that make up the whereyoubeen tools.

The header drags all the links to check into the JavaScript, and then calls the whereyoubeen JavaScript function.

WYB Header (wyb_header.inc.php)

<?php
/*
  @file: wyb_header.inc.php - prints the whereyoubeen javascript. This
    file should be included in the page you want to check. Using
    something like
 
 
         // If you are already using jquery, set this to True
         $ALREADY_HAVE_JQUERY=False;
         // would be www.example.com/whereyoubeen/
         $WYB_PATH="/whereyoubeen"; 
         // this includes the header
         include($_SERVER['DOCUMENT_ROOT'] .
          "$WYB_PATH/wyb_header.inc.php"); 
 
 
 
  @author: Victory
  @site: http://dfhu.org/blog/
  @version: 1.0
  @date: 090721
  @license: BSD
 
 */
 
// A better check would be to see if $_COOKIE is set, but that would
// require a set page and a check page. Until then, just check to see
// if this is a Mozilla/Opera/Safari browser, and if not ...
$wyb_user_agent = isset($_SERVER['HTTP_USER_AGENT'])
  ? $_SERVER['HTTP_USER_AGENT'] : '';
if(!preg_match("/^(Mozilla|Opera)/",$wyb_user_agent)){
  // ... bail without printing anything into the page.
  return;
}
 
// construct the path to whereyoubeen on the server
$WYB_ABS_PATH=
  $_SERVER['DOCUMENT_ROOT'] . 
  "$WYB_PATH/";
 
// open up the sqlite database;
$db=new PDO("sqlite:$WYB_ABS_PATH/db/wyb.sqlite");
 
// Now, Check to see if the user's IP is in the users table by ...
$sql="
SELECT
 rowid
FROM 
 users
WHERE
 remote_addr=:remote_addr
 ";
// ... preparing and ...
$stmt=$db->prepare($sql);
// ... executing the $sql statement using the user's IP address.
$stmt->execute(Array(":remote_addr"=>$_SERVER['REMOTE_ADDR']));
 
// if the user's ip is in the database ...
if($stmt->fetch()){
  // ... return and don't run any tests.
  return;
}
 
// So now that we decided we are going to go ahead with the tests,
// lets ensure that the jquery variable is set.
if(!isset($ALREADY_HAVE_JQUERY)){
  $ALREADY_HAVE_JQUERY=False;
}
 
// If we need jquery ...
if(!$ALREADY_HAVE_JQUERY){
  // ... then print the script element to include it.
echo "
<script 
  type=\"text/javascript\" 
  src=\"$WYB_PATH/js/jquery-1.3.2.min.js\"></script>
";
}
 
// Print the include statement for whereyoubeen.js which contains the
// logic to check which sites the user has been to.
echo "
<script 
  type=\"text/javascript\"
  src=\"$WYB_PATH/js/whereyoubeen.js\"></script>
";
?>
 
<style type="text/css">
/* set up different colors for visited/non visited links */
ul#silent_append a{
color: #F00 !important;
}
ul#silent_append a:visited{
color: #00F !important;
}
</style>
 
<script type="text/javascript">
 
<?php
  // If you already have jquery in your page, then don't mess with
  // your preference for conflict.
  if(!$ALREADY_HAVE_JQUERY){
    echo "jQuery.noConflict();";
  }
 
// Wait for the document to be ready ...
?>
jQuery(document).ready(function(){
 
  <?php
    // .. and when it is we need the urls to check.  Including
    // wyb_urls_get.php will produce a javascript parsable list of
    // quoted urls. We store that array in utc (Urls To Check).
  ?>
  var utc = 
    [<?php 
     include($_SERVER['DOCUMENT_ROOT'] . 
             "$WYB_PATH/wyb_urls_get.php"); 
     ?>];
 
  <?php
    // For all the links in utc, check to see if they are in the
    // user's history. This is done by checking the color of the
    // links as rendered in a hidden element of the dom.
  ?>
  var visited=
    whereyoubeen(
      utc,
      '<?php echo $WYB_PATH; ?>/wyb_urls_save.php');
 
  <?php
 
  // You could also use 'visited' here to change the DOM, for instance
  // create a meta redirect, or place a "Warning About Competitor,"
  // popup on the page and so on.
 
  ?>
});
 
</script>

Running this on the index.php above will give something like:

Actualized (index.php)

<html>
 <head>
  <title>My Webpage</title>
 
  <script 
    type="text/javascript" 
    src="/wyb/js/jquery-1.3.2.min.js"></script>
 
   <script 
     type="text/javascript"
     src="/wyb/js/whereyoubeen.js"></script>
 
 <style type="text/css">
  /* set up different colors for visited/non visited links */
   ul#silent_append a{
   color: #F00 !important;
  }
  ul#silent_append a:visited{
   color: #00F !important;
  }
 
</style>
 
<script type="text/javascript">
 
jQuery.noConflict();
jQuery(document).ready(function(){
 
    var utc = 
      ['http://dfhu.org/blog/index.php',
       'http://www.bing.com/',
       'http://www.whycanttoryread.com/',
       'http://dfhu.org/blog/',
       'http://www.ebay.com/',
       'http://www.craigslist.org/',
       'http://chicago.craigslist.org/',
       'http://exactly.com/as/itwould.html'];
 
    var visited=
      whereyoubeen(
        utc,
        '/wyb/wyb_urls_save.php');
});
 
 </script>
 </head>
 <body>
   <p>This is some content to show the user</p>
 </body>
</html>

NOTE: this will not record every visit, only one visit per IP address per day. This is accomplished with a SQLite queue which is created using a TRIGGER. The following shows the database schema. Running this script also clears out any data in the database.

The Database Schema (wyb_makedb.php)

 
<?php
 
$db=new PDO("sqlite:db/wyb.sqlite");
 
$sql="DROP TABLE IF EXISTS users";
$db->query($sql);
 
$sql="
CREATE TABLE IF NOT EXISTS users (
 remote_host TEXT,
 remote_addr TEXT UNIQUE,
 last_visit DATETIME DEFAULT CURRENT_TIMESTAMP
);";
$db->query($sql);
 
 
$sql="DROP INDEX IF EXISTS remote_addr_idx";
$db->query($sql);
 
 
$sql="
CREATE INDEX IF NOT EXISTS remote_addr_idx 
 ON users(remote_addr)
";
$db->query($sql);
 
 
$sql="DROP TABLE IF EXISTS whereyoubeen";
$db->query($sql);
 
$sql="
CREATE TABLE IF NOT EXISTS whereyoubeen (
 remote_addr TEXT,
 remote_host TEXT,
 user_agent TEXT,
 url TEXT,
 last_visit DATETIME DEFAULT CURRENT_TIMESTAMP
)";
$db->query($sql);
 
$sql="
CREATE TRIGGER IF NOT EXISTS
 clean_up_old 
BEFORE INSERT ON 
 users
BEGIN
 DELETE FROM 
  users 
 WHERE 
  last_visit < DATETIME('NOW','-1 day');
END
";
$db->query($sql);
?>

I didn’t comment this much, because SQL (being a declarative language and all) is pretty easy to read directly. When you are testing a new setup, you can run this script in between visits to clean out the user data (so wyb_header.inc.php will fire off). Otherwise, you could use the sqlite3 command line client to delete manually.

Manually deleting user data with SQLite

shell% sqlite3              
SQLite version 3.5.9
Enter ".help" for instructions
sqlite> attach database 'db/wyb.sqlite' as wyb;
sqlite> delete from wyb.users;
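To see how the clean_up_old TRIGGER from the schema behaves, here is a small in-memory experiment in Python. It reuses the users table and trigger definition from above; the helper should_run_tests is hypothetical and simply combines the "is this IP already known?" check with the insert, and the backdated row stands in for a visit from two days ago.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE users (
 remote_host TEXT,
 remote_addr TEXT UNIQUE,
 last_visit DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER clean_up_old
BEFORE INSERT ON users
BEGIN
 DELETE FROM users WHERE last_visit < DATETIME('NOW','-1 day');
END;
""")


def should_run_tests(db, remote_addr):
    """True if this IP is not in users yet; if so, record it now."""
    seen = db.execute(
        "SELECT rowid FROM users WHERE remote_addr = ?", (remote_addr,)
    ).fetchone()
    if seen:
        return False
    db.execute("INSERT INTO users (remote_addr) VALUES (?)", (remote_addr,))
    return True


# Backdate one visitor by two days to stand in for an old visit.
db.execute(
    "INSERT INTO users (remote_addr, last_visit) "
    "VALUES (?, DATETIME('NOW','-2 days'))",
    ("192.0.2.1",),
)

results = [
    should_run_tests(db, "192.0.2.1"),  # stale row is still in the table
    should_run_tests(db, "192.0.2.2"),  # new IP: its insert fires the trigger
    should_run_tests(db, "192.0.2.1"),  # stale row has been purged by now
]
print(results)
```

The thing to notice is that a stale row is only purged when some later insert fires the trigger, so on a quiet site the one-day window can stretch until the next new visitor arrives.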

The code for getting (wyb_urls_get.php) and saving (wyb_urls_save.php) is not really that interesting. Both files are well commented, and if you have questions you can post them in the comments.

Now let’s look at the JavaScript. Remember that this requires jQuery.

Pseudo-Searching Browser History (js/whereyoubeen.js)

 
function whereyoubeen(urlsToCheck,path_to_savelinks){
 
  // If we have no links to check ...
  if(urlsToCheck.length == 0){
    // ... just bail.
    return;
  }
 
  // So we have links to check and don't like to write 'jQuery' over
  // and over again so we shorten it to $. Keeping it a local var
  // means we don't undo jQuery.noConflict() for the rest of the page.
  var $=jQuery; 
 
 
  // To organize links a bit we build a hidden ul so that we can
  // append the links to test.
  $('body')
    .append("<ul id='silent_append'></ul>");
 
 
  // Lets create a helper function to append links to ...
  function appendLink(href,id){
    // ... append a link to #silent_append.
 
    // @param string href - link's href
    //
    // @param string id - the id of the link; the li that holds it
    // gets the id 'li_' + id. If it's not set a random id will be
    // chosen
    //
    // Returns id which may have been generated randomly
 
    // We are going to use the id later to find css and to remove,
    // so lets create one if we don't have one. The prefix keeps the
    // id from starting with a digit.
    if(!id){
      id='wyb_'+Math.floor(Math.random()*10000);
    }
 
    // A modest, but proud, link is created here.
    var link = 
    '<a id="' +id+ '" href="'
      +href+'">'+href+'</a>';
 
    // Place that link in the id="silent_append" ul where it will be
    // easy to get to.
    $("#silent_append")
      .append('<li style="display:block;" id="li_' +id+
                  '">'+link+'</li>');	       
 
    return id;
  }// appendLink
 
 
 
  // We are going to see what the css color is for a URL that has not
  // been visited, so here is a URL that hasn't been visited.
  var noVisited = 
    'http://' + 
    Math.floor(Math.random()*1000000) +  
    ".com";
 
  // Now if we append a link we haven't visited and one that we have
  // visited then ...
  appendLink(document.location.href,"yes_visited");
  appendLink(noVisited,'no_visited');
 
  // ... we can calibrate using the links color to see what the colors
  // are for visited and unvisited links.
  var yesVisitedColor =
    $("#yes_visited").css("color");
  var noVisitedColor = 
    $("#no_visited").css("color");
 
  // Remove the calibration elements from the dom to keep the site
  // moving zippy.
  $("#li_yes_visited").remove();
  $("#li_no_visited").remove();
 
  // If those colors are the same ...
  if(yesVisitedColor == noVisitedColor){
    // ... then just forget about it, your css is crap, i am going
    // home.
    return; 
  }
 
 
  // We need a place to store links that have been visited, let's use
  // an array
  var visited=[];
 
  // now for each of the links to check ...
  for(var i = 0; i < urlsToCheck.length; i++){
    // ... append them to our ul so that we can ...
    var idOfString=appendLink(urlsToCheck[i]);
 
    // ... check their color against the visited color and ...
    var curColor=$("#" + idOfString).css("color");
    if(curColor == yesVisitedColor){
      // ... if it matches we append it to the visited array.
      visited.push(urlsToCheck[i]);
    }
 
    // we won't be needing that link clogging up the dom anymore so
    // lets remove it.  
    $("#li_" + idOfString).remove();
  }// for urlsToCheck
 
  // We then send our results off to the database via PHP.
  $.ajax({
    type: "POST",
    url: path_to_savelinks,
    data: {'visited': visited.join("|||")},
    success: function(data){
      // you could uncomment this if you wanted savelinks to say
      // something, or maybe send back an XML object 
      //$('body').append("<p>" + data + "</p>");
    }
  });
 
  // finally we return visited so it can be used to affect the dom
  return visited;
};

The highlight of the script is the calibration step: it creates one visited and one unvisited link and reads back their colors. The rendered color is a tell-tale sign of whether a user has visited a given page or not.
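The comparison at the heart of that calibration can be sketched without any DOM at all. This is only an illustration: pickVisited and the colorOf callback are hypothetical names that are not part of whereyoubeen.js, and the colors are made-up sample values standing in for what the browser would report.

```javascript
// Sketch only: pickVisited and colorOf are hypothetical names.
// colorOf(url) stands in for appending a link and reading its computed color.
function pickVisited(urls, colorOf, visitedColor, unvisitedColor) {
  // If both calibration colors match, the stylesheet gives us no signal.
  if (visitedColor === unvisitedColor) {
    return [];
  }
  var visited = [];
  for (var i = 0; i < urls.length; i++) {
    if (colorOf(urls[i]) === visitedColor) {
      visited.push(urls[i]);
    }
  }
  return visited;
}

// Made-up colors standing in for the browser's answers.
var sampleColors = {
  "http://example.com/a": "rgb(85, 26, 139)",
  "http://example.com/b": "rgb(0, 0, 238)"
};
var found = pickVisited(
  ["http://example.com/a", "http://example.com/b"],
  function (url) { return sampleColors[url]; },
  "rgb(85, 26, 139)",
  "rgb(0, 0, 238)"
);
console.log(found);
```

Passing the color lookup in as a callback keeps the decision logic separate from jQuery, which makes it easy to try out with fake data.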

The script then uses an AJAX request to send the data to PHP, which in turn sends it off to the SQLite database.
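The visited URLs travel as a single string joined with "|||", so whatever receives the POST has to split on the same separator. Here is a small sketch of that round trip, written in JavaScript purely for illustration (the real receiver is the PHP save script); packVisited and unpackVisited are hypothetical helper names.

```javascript
// Sketch only: packVisited and unpackVisited are hypothetical helpers.
var SEPARATOR = "|||";

function packVisited(visited) {
  // Same packing the script does before the POST.
  return visited.join(SEPARATOR);
}

function unpackVisited(payload) {
  // An empty payload means no visited links, not one empty URL.
  if (payload === "") {
    return [];
  }
  return payload.split(SEPARATOR);
}

var payload = packVisited(["http://example.com/", "http://example.org/page"]);
console.log(payload);
console.log(unpackVisited(payload));
```

One caveat with this scheme: a URL that itself contains "|||" would be split in two, so the separator only works as long as that sequence never shows up in the list.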

I really need feedback from you.

If you are technical I would love for you to point out any bugs or issues you see. I also like to chat about coding styles and idioms so feel free to post that flavor of discussion too.

If you are not technical, then let me know if you had any problems setting up the script, and share any clever ideas for how to use this very juicy information.

There are countless mods and possibilities for these tools, and I would love to hear your ideas on what this could be used for. If it's simple enough I might implement it for free and put it on here; if it's more involved we can work out a fair price for private coding work.

This is released under a BSD license, so you can use it in your commercial projects and distribute it on your website (I just ask for a link back to me at dfhu.org).


RSS Generator For Importing to Wordpress

Maybe all that MySQL command line editing in the last post was a bit more than you wanted to deal with, in which case this PHP function may be handy for you.

This function takes an array of items and creates an RSS feed with a slow-drip staggering of publish dates.

rss_maker_for_wordpress()

 
function rss_maker_for_wordpress($items,
                                 $dates=.5,
                                 $back_date=1.0){
 
  /*
 
    Builds a (possibly invalid XML) RSS feed suitable for importing to
    wordpress.
 
    Author: Victory
    Release Date: 090815
    Version: 1.0
 
    For the description of the parameters see function rss_maker();
 
  */
 
  return rss_maker($items,
                   $dates,
                   $back_date,
                   "Wordpress Importable",
                   "http://example.net/",
                   "A WP Feed",
                   True,
                   False);
}

rss_maker_for_wordpress() requires the more general rss_maker() function.

rss_maker()

 
 
function rss_maker($items,
                   $dates=.5,
                   $back_date=0,
                   $title="The Feed", 
                   $link="http://example.com/",  
                   $description="A Feed",
                   $no_guid=False,
                   $escape_with_cdata=True){
 
  /*
    Builds a (possibly invalid XML) RSS feed.
 
    Author: Victory
    Release Date: 090815
    Version: 3.0
 
    @param array(array()) $items - The items to be inserted into the
    feed. The inner array must set 'title' and 'description'; 'link'
    is optional and is used for the item's <link> and <guid>.
 
    example:
 
    $items=Array(Array("title"=>"Item 1",
                       "description"=>"Body of text for item 1"),
                 Array("title"=>"Second Item",
                       "description"=>"Body of text for item 2"));
 
    @param [array or float] $dates - If an array then it's a list of
    Published Dates (PubDate) as Unix timestamps, e.g. from mktime().
    If it's a float, the dates are chosen randomly, each one stepped
    forward from the last by roughly up to that number of days.
 
    @param float $back_date - the number of days before today to start
    creating random dates (ignored if $dates is an array)
 
    @param string $title - the title of the feed (ignored by
    wordpress)
 
    @param string $link - the feed's link (ignored by wordpress)
 
    @param string $description - the feed's description (ignored by
    wordpress)
 
    @param bool $escape_with_cdata - if True, wrap each item's title
    and description in <![CDATA[ ]]>
 
    Basic Usage:
 
    $my_feed=
      rss_maker_for_wordpress(
       Array(
        Array("title"=>"Item 1",
              "description"=>"Body of text for item 1"),
        Array("title"=>"Second Item",
              "description"=>"Body of text for item 2")));
 
    NOTE: To get valid XML you would need to htmlspecialchars() escape
    $items description.
 
  */
 
 
  /* start by setting the feeds channel info */
 
  // So if the dates are not set explicitly ...
  if(!is_array($dates)){
    // ... use a random time this morning as the pubdate,
    $pubDate = mktime(rand(0,5),rand(0,50),rand(0,50),
                      date("n"),date("j"),date("Y"));
  }else{// ... otherwise use the first date of $dates
    $pubDate = $dates[0];
  }
 
  // now get link, pubdate etc...
  $pubDate=date("r", $pubDate);
  $rss_link=$link;
  $rss_title=$title;
  $rss_description=$description;
  $rss_language="en-us";
 
  // and shove that into the feeds head.
  $feed[]="<" . "?xml version=\"1.0\"" . "?" . ">
<rss version=\"2.0\">
  <channel>
    <title><![CDATA[$rss_title]]></title>
    <link>$rss_link</link>
    <description><![CDATA[$rss_description]]></description>
    <language>$rss_language</language>
    <pubDate>$pubDate</pubDate>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  ";
 
 
  // We can set up a template for a feed item. Note that if we want
  // to be more confident that this will create a valid feed then we
  // can use <![CDATA[%variable%]]> here instead, but wordpress and
  // others don't always play nice with such feeds. Also you would
  // have to set a <guid>%link%#%guid%</guid> for this to be truly
  // valid and the items in $items would need a $item['link'];
 
  $rss_item="
    <item>
      <link>%link%</link>
  ";
  if($escape_with_cdata){
    $rss_item.="
      <title><![CDATA[%title%]]></title>
      <description><![CDATA[%description%]]></description>
  ";
  }else{
    $rss_item.="
      <title>%title%</title>
      <description>%description%</description>      
  ";
  }
  // skip the guid when asked to, or when the items carry no link
  if(!$no_guid && isset($items[0]['link'])){
    $rss_item .= "
      <guid>%link%#%guid%</guid>
  ";
  }
  $rss_item .="
      <pubDate>%pubdate%</pubDate>
    </item>
  ";
 
  // if dates aren't set then make dates randomly in the future
  if(!is_array($dates)){
 
    // offsets are kept in minutes (1440 minutes in a day); start
    // $back_date days in the past
    $cur_day=-intval($back_date * 1440);
 
    $max_time_interval=max(1,intval($dates * 1440));
    $min_time_interval=intval(floor($max_time_interval/2.0));
 
    // switch $dates over to an array
    $dates=Array();
 
    // create a random date for every item in the list
    for($i=0; $i<count($items); $i++){
 
      // step the cur_day forward
      $cur_day+=rand($min_time_interval,
                     $max_time_interval);
 
      // resolve the offset once so month, day and year agree
      $stamp=strtotime("+$cur_day minutes");
      $dates[]=
        mktime(rand(12,20),
               rand(0,55),
               rand(0,55),
               date("n",$stamp),
               date("j",$stamp),
               date("Y",$stamp));
 
 
    }
    // "blog" order
    $dates=array_reverse($dates);
  }
 
  // We need to set up the variables array for the template
  $vars=Array("%title%","%link%","%description%",
              "%pubdate%","%guid%");
 
  // now just sub in our values and append them to the feed
  foreach($items as $i=>$item){
    $pubDate=date("r", $dates[$i]);
    // 'link' is optional, so fall back to an empty string
    $item_link=isset($item["link"]) ? $item["link"] : "";
    $vals=Array($item["title"],$item_link,$item["description"],
                $pubDate,rand(10000,99999));
    $feed[]=str_replace($vars,$vals,$rss_item);
  }
 
  $feed[]="
  </channel>
</rss>
";
  return implode("\n",$feed);
}
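
The slow-drip staggering is the part of rss_maker() that is easiest to get lost in, so here is a rough sketch of just that step. It is written in JavaScript only for illustration and is not a port of the PHP: staggerOffsets is a hypothetical name, the rand argument is an assumption added so the output can be made repeatable, and the random time-of-day that the PHP adds on top is left out.

```javascript
// Sketch only: staggerOffsets is a hypothetical name, not part of rss_maker().
// Returns one offset per item, in minutes relative to now, newest first.
function staggerOffsets(count, daysPerStep, backDate, rand) {
  rand = rand || Math.random;
  var MINUTES_PER_DAY = 1440;
  var maxStep = Math.max(1, Math.floor(daysPerStep * MINUTES_PER_DAY));
  var minStep = Math.floor(maxStep / 2);
  // Start backDate days in the past.
  var offset = -Math.floor(backDate * MINUTES_PER_DAY);
  var offsets = [];
  for (var i = 0; i < count; i++) {
    // Random integer in [minStep, maxStep], like PHP's rand(min, max).
    offset += minStep + Math.floor(rand() * (maxStep - minStep + 1));
    offsets.push(offset);
  }
  // Newest first, i.e. "blog" order.
  return offsets.reverse();
}

// With a fixed "random" value every step is the same size.
console.log(staggerOffsets(3, 0.5, 1, function () { return 0.5; }));
```

Because every step is at least half the maximum, the posts never bunch up, yet the gaps still vary enough that the publish dates do not look machine generated.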

Please tell me if you found this tool and post useful; if not, maybe you could suggest a topic.
