Sunday, December 11, 2011


Travelling Salesman Problem is quite a common task in development world yet it turned to be very hard to find simple modern solutions.
So I had to look through tons of source codes and awful "old-school" websites to get normal description of Branch And Bound method that is most likely to solve this problem for a limited number of cities.

Here you can see the PHP source code.
This solution takes an input of time limit and file with cities and their coordinates.
It outputs the best length found within given time period.

Usage:
Save php code as solve.php
set $time_limit = 5*60-20; to the time limit you need.
Save cities data as cities.txt
run php solve.php
get solution, that's it!


Download source files here TravellingSalesmanProblem-BranchAndBound-PHP.zip Solve.php
<?php
set_time_limit(300);
define(INFIN, 1000000000000);
$start_time = microtime(true);

#-------------------
# read input file
#-------------------
$c = file('cities.txt');
$cities = array();
foreach($c as $v) {
 $v = trim($v);
 if (!$v) continue;
 $cities []= explode("\t", $v);
}

#-------------------
# calculate distances
#-------------------
$distances = array();
foreach($cities as $k1 => $v1)
 foreach($cities as $k2 => $v2) {
  if ($k1 == $k2)
   $distances[$k1][$k2] = -1;
  else
   $distances[$k1][$k2] = distance($v1[1], $v1[2], $v2[1], $v2[2]);
 }

#-------------------
# We don't need to return to starting point, so add a zero fake node.
# But we need to start in node 0, so incoming ribs will be 0 weight, outgoing to node 0 also 0 weight, and all other outgoing an INFIN
#-------------------
$distances []= array_fill(0, count($distances), INFIN);
foreach($distances as $k => $v) $distances[$k] []= 0;
$distances[count($distances)-1][count($distances)-1] = -1;
$distances[count($distances)-1][0] = 0;


#-------------------
# setup params to main algo and start
#-------------------
$iterations = 0;
$time_limit = 5*60-20;
TSP::$distances = $distances;
TSP::$time_limit = $time_limit;
TSP::BranchAndBound();

#-------------------
# output results
#-------------------
output_best_result();


/**
* Class that processes whole TSP solution
*/
class TSP {
 
 public static $reduced_matrix = array();
 public static $distances = array();
 public static $time_limit = 0;
 public static $level = 0;
 public static $best_cost = INFIN;
 public static $best_path = array();
 public static $current_path = array();
 
 /**
 * Main entry point
 * All params should be saved in static properties prior to calling this
 */
 public function BranchAndBound() {
  
  $n = count(TSP::$distances);
  for ($i = 0; $i < $n; $i++)
   for ($j = 0; $j < $n; $j++)
    TSP::$current_path[$i][$j] = 0;

  TSP::$reduced_matrix = TSP::$distances;
  $initCost = TSP::ReduceCostMatrix(TSP::$reduced_matrix);

  $totalCost = TSP::findPath($initCost, TSP::$reduced_matrix, TSP::$current_path);

  return $totalCost;
 }
 
 /**
 * Recursive function, split all choices in to 2
 * @param int $current_cost Current cost of the path
 * @param array $current_matrix Current matrix with distances
 * @param array $current_path Current matrix with chosen ribs
 */
 public function findPath($current_cost, $current_matrix, $current_path) {
  
  TSP::$level++;
  $n = count(TSP::$distances);
  $count = 0;

  list($i, $j) = TSP::findHighestZero($current_matrix);
  
  #-------------------
  # check if it's a last node and we have found a new optimal path
  #-------------------
  if ($i == -1 && $j == -1) {
   $path = TSP::getBestPath($current_path);
   if (count($path) >= count($current_path)-1 && 
    $current_cost < TSP::$best_cost) {
    TSP::$best_cost = $current_cost;
    TSP::$best_path = $current_path;
   }
   TSP::$level--;
   return INFIN;
  }

  #-------------------
  # Check time restrictions.
  # Output current results immediately if running out of time.
  #-------------------
  global $iterations, $start_time;
  if ($iterations++ % 10 == 0 && microtime(true) - $start_time > TSP::$time_limit)
   output_best_result();
  
  $tempReduceRow = $current_matrix;
  $tempReduceCol = $current_matrix;

  TSP::OmitLeft($i, $j, $tempReduceRow, $current_path);
  $t = TSP::ReduceCostMatrix($tempReduceRow);
  $LBound = $current_cost + $t;

  TSP::OmitRight($i, $j, $tempReduceCol);
  $t = TSP::ReduceCostMatrix($tempReduceCol);
  $RBound = $current_cost + $t; 

  $pathCost = $pathCost2 = INFIN;
  if ($LBound <= $RBound && $LBound < TSP::$best_cost) {
   $current_path[$i][$j] = 1;
   $pathCost = TSP::findPath($LBound, $tempReduceRow, $current_path);
   if ($pathCost > $RBound && $RBound < TSP::$best_cost) {
    $current_path[$i][$j] = 0;
    $pathCost2 = TSP::findPath($RBound, $tempReduceCol, $current_path);
   }
  } elseif ($RBound < TSP::$best_cost) {
   $pathCost = TSP::findPath($RBound, $tempReduceCol, $current_path);
   if ($pathCost > $LBound && $LBound < TSP::$best_cost) {
    $current_path[$i][$j] = 1;
    $pathCost2 = TSP::findPath($LBound, $tempReduceRow, $current_path);
   }
  }
  $current_path[$i][$j] = 0;
  TSP::$level--;
  return min($pathCost, $pathCost2);
 }

 /**
 * Process "left" subtotal of all choices
 * @param int $row Row number
 * @param int $col Col number
 * @param array $temp Array to update with weights
 * @param array $current_path Current matrix with chosen ribs
 */
 public function OmitLeft($row, $col, &$temp, $current_path) {   
  for ($j = 0; $j < count($temp); $j++)
   $temp[$row][$j] = -1;
  for ($i = 0; $i < count($temp); $i++)
   $temp[$i][$col] = -1;
  $temp[$col][$row] = -1;
  
  
  #-------------------
  # remove all ribs from matrix that can potentially form a cycle after adding current rib
  #-------------------
  $current_path[$row][$col] = 1;
  if (TSP::sumMatrix($current_path) < count($current_path)-1) {
   $fl = false;
   for ($i = 0; $i < count($temp); $i++)
    for ($j = 0; $j < count($temp); $j++)
     if ($temp[$i][$j] != -1 && TSP::isCycle($i, $j, $current_path)) {
      $temp[$i][$j] = -1;
      $fl = true;
     }
  }
 }
 
 /**
 * Process "right" subtotal of all choices
 * @param int $row Row number
 * @param int $col Col number
 * @param array $temp Array to update with weights
 */
 public function OmitRight($row, $col, &$temp) {
  $temp[$row][$col] = -1;
 }
 
 /**
 * Reduce costs of a matrix by rows AND cols
 * @param array $temp Matrix to update
 */
 public function ReduceCostMatrix(&$temp) {
  return TSP::ReduceRows($temp) + TSP::ReduceCols($temp);
 }
 
 /**
 * Reduce costs of a matrix by rows 
 * @param array $temp Matrix to update
 */
 public function ReduceRows(&$temp) {
  $n = count($temp);
  $minBound = 0;
  for ($i = 0; $i < $n; $i++) {
   $min = INFIN;
   $flag = true;
   for ($j = 0; $j < $n; $j++)
   if ($temp[$i][$j] < $min && $temp[$i][$j] != -1) {
    $min = $temp[$i][$j];
    $flag = false;
   }
   if (!$flag) {
    TSP::SubtractRow($i, $min, $temp);
    $minBound += $min;
   }
  }
  return $minBound;
 }
 
 /**
 * Reduce costs of a matrix by cols
 * @param array $temp Matrix to update
 */
 public function ReduceCols(&$temp) {
  $n = count($temp);
  $minBound = 0;
  for ($j = 0; $j < $n; $j++) {
   $min = INFIN;
   $flag = true;
   for ($i = 0; $i < $n; $i++)
   if ($temp[$i][$j] < $min && $temp[$i][$j] != -1) {
    $min = $temp[$i][$j];
    $flag = false;
   }
   if (!$flag) {
    TSP::SubtractCol($j, $min, $temp);
    $minBound += $min;
   }
  }
  return $minBound; 
 }
 
 /**
 * Subtract a value from matrix row
 * @param int $row Row to update
 * @param float $sub Value to subtract
 * @param array $temp Matrix to update
 */
 public function SubtractRow($row, $sub, &$temp) {
  for ($j = 0; $j < count($temp[0]); $j++)
   if ($temp[$row][$j] != -1) $temp[$row][$j] -= $sub;     
 }
 
 /**
 * Subtract a value from matrix col
 * @param int $col Col to update
 * @param float $sub Value to subtract
 * @param array $temp Matrix to update
 */
 public function SubtractCol($col, $sub, &$temp) {
  for ($j = 0; $j< count($temp); $j++)
   if ($temp[$j][$col] != -1) $temp[$j][$col] -= $sub;
 }
 
 /**
 * Find "heaviest" zero in a matrix
 * @param array $temp Matrix to use
 */
 public function findHighestZero($temp) {
  $n = count($temp);
  $max = -1;
  $maxI = $maxJ = -1;
  for ($i = 0; $i < $n; $i++) {
   for ($j = 0; $j < $n; $j++)
    if ($temp[$i][$j] == 0) {
     $minI = $minJ = INFIN;
     for($k = 0; $k < $n; $k++) {
      if ($temp[$i][$k] != -1 && $j != $k && $temp[$i][$k] < $minI) $minI = $temp[$i][$k];
      if ($temp[$k][$j] != -1 && $i != $k && $temp[$k][$j] < $minJ) $minJ = $temp[$k][$j];
     }
     $minI = ($minI == INFIN ? 0 : $minI);
     $minJ = ($minJ == INFIN ? 0 : $minJ);
     if ($minI + $minJ > $max) {
      $max = $minI + $minJ;
      $maxI = $i;
      $maxJ = $j;
     }
    }
  }
  return array($maxI, $maxJ);
 }
 
 /**
 * Generate a plain path from a graph matrix
 * @param array $matrix Matrix to update
 */
 public static function getBestPath($matrix) {
  $n = count($matrix);
  $path = array();
  
  #-------------------
  # find any node connected with fake one - it will start the path
  #-------------------
  $start_node = -1;
  for($i = 0; $i < $n; $i++)
  if ($matrix[$n-1][$i] == 1) {
   $start_node = $i;
   break;
  }
  
  #-------------------
  # make the matrix symmetrical for easier path generation
  # and use $n-1 as we cut out the last fake node
  #-------------------
  for($i = 0; $i < $n-1; $i++)
   for($j = 0; $j < $n-1; $j++)
    if ($matrix[$i][$j] == 1) $matrix[$j][$i] = 1;
  
  #-------------------
  # walk and generate the path
  #-------------------
  $path []= $start_node;
  $next_node = $start_node;
  while(($next_node = TSP::getPathNextNode($next_node, $path, $matrix)) != -1)
   $path []= $next_node;
  
  return $path;
 }

 /**
 * Subroutine for getBestPath(), finds a rib coming from given node
 * @param int $row Node
 * @param array $path Existing path
 * @param array $temp Matrix to use
 */
 public static function getPathNextNode($row, $path, $matrix) {
  for($i = 0; $i < count($matrix)-1; $i++)
   if ($matrix[$row][$i] == 1 && !in_array($i, $path)) return $i;
  return -1;
 }
 
 /**
 * Sum all values in a matrix row
 * @param int $row Row
 * @param array $matrix Matrix to use
 */
 public static function SumRow($row, $matrix) {
  $res = 0;
  foreach($matrix[$row] as $v) $res += $v;
  return $res;
 }
 
 /**
 * Sum all values in a matrix col
 * @param int $col Col
 * @param array $matrix Matrix to use
 */
 public static function SumCol($col, $matrix) {
  $res = 0;
  foreach($matrix as $v) $res += $v[$col];
  return $res;
 }
 
 /**
 * Sum all values in matrix cells
 * @param array $matrix Matrix to use
 */
 public static function sumMatrix($matrix) {
  $res = 0;
  for($i = 0; $i < count($matrix); $i++)
   for($j = 0; $j < count($matrix); $j++)
    $res += $matrix[$i][$j];
  
  return $res;
 }
 
 /**
 * Check if adding a ($v1, $v2) cell will create any cycles for a given matrix
 * @param int $v1 Row to add
 * @param int $v2 Col to add
 * @param array $matrix Matrix
 */
 public static function isCycle($v1, $v2, $matrix) {
   $res = false;
   $v = $v2;
   $matrix[$v1][$v2] = 1;
   $CycLen = 1;
   $rib_count = TSP::sumMatrix($matrix);
    while ($CycLen < $rib_count+1) {
     $fl = false;
     for($i = 0; $i < count($matrix); $i++)
      if ($matrix[$v][$i] == 1) {
       $v = $i;
       $CycLen++;
       if ($v == $v1) return true;
       $fl = true;
      }
     if (!$fl) break;
    }
    return false;
 }
 
}

/**
* Processes script's output
*/
function output_best_result() {
 global $iterations, $start_time, $cities;

 $path = TSP::getBestPath(TSP::$best_path);
 foreach($path as &$v) $v++;

 foreach($path as $city)
  print($cities[$city-1][0]."\n");
 
 exit;
}

/**
* Calculate distance between 2 GPS points
* @param string $lat1 Latitude point 1
* @param string $lon1 Longitude point 1
* @param string $lat2 Latitude point 2
* @param string $lon2 Longitude point 2
*/
function distance($lat1, $lon1, $lat2, $lon2) {

 $theta = $lon1 - $lon2; 
 $dist = sin(deg2rad($lat1)) * sin(deg2rad($lat2)) +  cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * cos(deg2rad($theta)); 
 $dist = acos($dist); 
 $dist = rad2deg($dist); 
 $miles = $dist * 60 * 1.1515;
 
 return $miles;
}


?>
Cities.txt
Beijing 39.93 116.40
Vladivostok 43.8 131.54
Dakar 14.40 -17.28
Singapore 1.14 103.55
Adams 42.6167 -73.1167
Bedford 42.4667 -71.4667
Beverly 42.5833 -70.9
Boston 42.3667 -71.0333
Brookline 42.3333 -71.1167
Cambridge 42.3667 -71.1
Chelmsford 42.7 -71.4
Chicopee 42.2 -72.5333
Clinton 42.4 -71.6833
Fitchburg 42.55 -71.75
Framingham 42.2833 -71.4167
Gloucester 42.5833 -70.6833
Greenfield 42.7 -72.0667
Haverhill 42.7833 -71.0833
Hyannis 41.6667 -70.2667
Lawrence 42.7037 -71.163
Lenox 42.35 -73.2833
Lowell 42.6333 -71.3167
Marshfield 42.1 -70.6833
Methuen 42.7333 -71.1833
Nantucket 41.2833 -70.1
Norwood 42.1833 -71.1667
Pittsfield 42.4333 -73.3
Plymouth 41.9 -70.7167
Provincetown 42.0667 -70.2167
Quincy 42.25 -71
Scituate 42.2 -70.7167
Taunton 41.9 -71.0667 

Friday, July 23, 2010

External Rewriting Program and RewriteLock

QUESTION:
I use External Rewriting Program in my apache config
RewriteEngine on
RewriteMap myMapProgram prg:/proxy/location.php

but the output of this script is messed up and apache gives weird errors. What's that about?

ANSWER:
When you use External Rewriting Program make sure to always specify RewriteLock directive.

In fact after adding RewriteMap prg: statement you'll notice that in your error logs apache warns that you should use RewriteLock directive:
Running external rewrite maps without defining a RewriteLock is DANGEROUS!

And there's at least one good reason to use RewriteLock.
If you don't do it, everything will seem to be fine until your live server starts getting much traffic. Concurrent apache hits result in calls to your External Rewriting Program which won't be able to normally synchronize it's output across apache threads.
The results of this are unpredictable - sometimes apache reads normal output from myMapProgram, but sometimes it is bullshit like 1 or 2 random symbols from output.

For example I started getting
"Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace."
error because incorrect output of my custom rewriting script caused wrong redirects in my code.

What's this blog about?

I'm a web developer and come across interesting problems and solutions all the time.
For me this blog is a place where I can share my solutions for specific problems with anyone else.
Usually problems described here are non-trivial and can't be simply googled.
Cheers!