Repository: xme/pastemon
Branch: master
Commit: 0229f9db1efb
Files: 6
Total size: 66.1 KB

Directory structure:
gitextract_v29hyv1s/

├── README
├── pastemon.conf.sample
├── pastemon.pl
├── proxies.conf
├── regex.conf.sample
└── user-agents.conf

================================================
FILE CONTENTS
================================================

================================================
FILE: README
================================================
Introduction
------------

pastemon.pl is a script which runs in the background as a daemon and monitors pastebin.com for
interesting content (based on regular expressions). Found information is sent to syslog.

The script can also generate CEF events.

More information is available here: 
http://blog.rootshell.be/2012/01/17/monitoring-pastebin-com-within-your-siem/

v1.14 - 2012/10/31
------------------
- [FEATURE] Added SQLite DB support to store pasties details
  (some fields must still be implemented)

v1.13 - 2012/10/24
------------------
- [CONTRIBUTION] Added support for multiple SMTP recipients (email addresses separated by commas)
  Contribution from coreyroach@hotmail.com
- [CONTRIBUTION] Added a new macro (%S) to specify the site name in the dump function.
  '%S' will be replaced by the site name. Example: '%S/%Y/%M' => 'pastebin.com/2012/10'. 

v1.12 - 2012/09/20
------------------
- [BUGFIX] Fixed FuzzyMatch() which was broken with gzipped pasties.
- [FEATURE] Email notification: The subject is now appended with the <description> field(s)
  corresponding to the matched regex(es). This allows a better view of received emails as well
  as filtering them.
- [BUGFIX] Fixed FuzzyMatch() to detect properly duplicate pasties (fixed regex).

v1.11 - 2012/09/14
------------------
- [FEATURE] Added support for nopaste.me.
- [FEATURE] Added support for pastesite.com.
- [FEATURE] Added configurable sleep delays per pastie website.

v1.10 - 2012/08/01
------------------
- [FEATURE] If no configuration file is specified, pastemon.pl tries to load /etc/pastemon.conf
  by default.

- [FEATURE] pastemon.pl uses specific Perl modules like WordPress::XMLRPC or Text::JaroWinkler.
  The script now properly handles environments without those modules. It is no longer required to
  comment them out in the code. If a module is missing, the related configuration is automatically disabled.

- [FEATURE] Added optional compression (via IO::Compress::Gzip) of dumped pasties.
  In configuration file:

	<core>
		<compress-pasties>yes</compress-pasties>
	</core>
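
The two v1.10 features above (optional modules and gzip compression) can be
sketched together as follows. This is only an illustration, not an excerpt of
pastemon.pl: the sub name compressFile and the ".gz" suffix appended to the
input name are assumptions made for the example. The module is loaded at run
time, so the script still starts when IO::Compress::Gzip is missing.

```perl
use strict;
use warnings;

# Load the optional module at run time; true only if it is installed.
my $haveGzip = eval { require IO::Compress::Gzip; 1 };

# Hypothetical helper: gzip a file to "<name>.gz" and remove the original.
# Returns 1 on success, 0 if the module is missing or compression failed.
sub compressFile {
	my ($in) = @_;
	return 0 unless $haveGzip;
	my $out = "$in.gz";
	# Fully-qualified call, because nothing was imported at compile time
	IO::Compress::Gzip::gzip($in => $out) or return 0;
	unlink($in);
	return 1;
}
```

When the module is not installed, compressFile() simply returns 0 and the
uncompressed file is left in place.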

v1.9 - 2012/07/23
-----------------
- [FEATURE] pastemon can now follow (search for regex) URLs detected in pasties. This is
  configured via the main configuration file:

	<urls>
		<follow>yes</follow>
		<matching>(bit\.ly)</matching>
	</urls>

- [FEATURE] The regex.conf format changed to an XML format.
  Examples:
	<regex>
		<search>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}</search>
		<count>10</count>
		<description>IP Address</description>
	</regex>

- [FEATURE] A minimum number of regex occurrences can be defined to notify
  (<count> tag in the XML file)

- [FEATURE] HTTP requests now use a random User-Agent.

- [BUGFIX] Optimized the detection of already processed pasties. This reduces the number of HTTP
  requests sent to the website.
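
As a rough illustration of the <count> threshold introduced in v1.9, the
following Perl sketch counts the occurrences of a pattern and only reports a
hit once the minimum is reached. The sub name countMatches and the sample text
are hypothetical. The counting idiom assumes the pattern has no capture groups.

```perl
use strict;
use warnings;

# Hypothetical helper: number of times $regex matches in $content.
sub countMatches {
	my ($content, $regex, $ignoreCase) = @_;
	my $re = $ignoreCase ? qr/$regex/mi : qr/$regex/m;
	# List-context match forced into a scalar count
	my $count = () = $content =~ /$re/g;
	return $count;
}

my $content  = "10.0.0.1 talked to 10.0.0.2 and 192.168.1.1";
my $search   = '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}';
my $minCount = 3;	# Plays the role of the <count> tag

my $found = countMatches($content, $search, 1);
if ($found >= $minCount) {
	print "Matched $found times (threshold: $minCount)\n";
}
```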

v1.8 - 2012/06/25
-----------------
- [FEATURE] Added support for pastie.org!
- [FEATURE] Added multi-thread support (1 thread per website monitored)
- [FEATURE] Added substitution macros in the dump directory. Supported macros are:
	%Y - replace with the current year
	%M - replace with the current month
	%D - replace with the current day
  Directory is automatically created.
  Example: /home/user/pastemon/%Y/%M/%D

- [FEATURE] Added a new configuration directive:
	<dump-all>yes|1</dump-all>
  This feature enables a dump of *ALL* pasties whether they match a regex or not.
  This is similar to a mirror mode.
  WARNING: Huge disk space might be required by this feature!
- [BUGFIX] Test if the provided SMTP server (for mail notifications) is available
  (Thanks to @manuelsubredu for the patch)
- [BUGFIX] Fixed an issue in createBlogPost() which caused an unexpected process exit.
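
The dump directory macros from v1.8 (%Y, %M, %D), together with %S from v1.13,
could be expanded along these lines. This is a minimal sketch: the sub name
expandDumpDir and its optional timestamp argument are assumptions made for the
example, and the directory creation step is left out.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: expand %Y, %M, %D and %S in a dump directory template.
# $time is optional (epoch seconds); the current time is used by default.
sub expandDumpDir {
	my ($template, $siteName, $time) = @_;
	my @t = localtime(defined $time ? $time : time);
	my %macros = (
		'Y' => strftime("%Y", @t),	# Four-digit year
		'M' => strftime("%m", @t),	# Two-digit month
		'D' => strftime("%d", @t),	# Two-digit day
		'S' => $siteName,
	);
	$template =~ s/%([YMDS])/$macros{$1}/g;
	return $template;
}

print expandDumpDir("/home/user/pastemon/%S/%Y/%M/%D", "pastebin.com"), "\n";
```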

v1.7 - 2012/05/11
-----------------
- Added support for "included" regular expressions
- Fixed a bug in getRegexDesc()
- Added support for comments ('#') in the regex configuration file
- Moved configuration parameters from command line switches to an XML file
- Added matching regex description in dump files 
- Added SMTP notifications
- Added distance check to detect duplicate pasties (using Jaro-Winkler algorithm)

v1.6 - 2012/02/21
-----------------
- Added a detection of "slow down" messages returned by Pastebin (add a small pause)
- Added support for Wordpress XMLRPC 
- Added support for random proxies
- Some bug fixes

v1.5 - 2012/02/19
-----------------
- Fixed the regex to grab pasties from the archive page. (HTML code changed)

v1.4 - 2012/02/15
-----------------
- Fixed a bug with CEF events: custom fields start at 1 not 0! (Thanks to Heiko Hansen for the report)
- Notify the presence of a proxy variable (HTTP_PROXY) 

v1.3 - 2012/01/26
-----------------
- Added a '--pidfile=file' configuration switch to specify an alternative location for the PID file.
  This allows the script to be executed with a non-root account.
- Added a '--sample=x' configuration switch to display a sample of data matching a regular expression. 'x' is
  the number of bytes displayed before and after the matching string. This is useful to estimate the
  value of the pastie.  Example:
  Found in http://pastebin.com/raw.php?i=Q8pQRHKW : belgium (2 times) | Sample: g(0) ""\n  [32] => string(11) "Belgium(32)"\n  [31] => string(14) "Ne
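
The sample extraction described for '--sample=x' can be sketched as follows.
The sub name extractSample is hypothetical, and the sketch counts characters
rather than bytes. It relies on Perl's @- and @+ arrays, which hold the start
and end offsets of the last successful match.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first match of $regex with up to $size
# characters of context on each side, with control characters escaped.
sub extractSample {
	my ($content, $regex, $size) = @_;
	return undef unless $content =~ /$regex/i;
	my ($start, $end) = ($-[0], $+[0]);
	my $from = $start - $size;
	$from = 0 if $from < 0;
	my $sample = substr($content, $from, ($end - $from) + $size);
	$sample =~ s/\r/\\r/g;
	$sample =~ s/\n/\\n/g;
	$sample =~ s/\t/\\t/g;
	return $sample;
}

print extractSample("id: 1\nname: Belgium(32)\nid: 2\n", 'belgium', 8), "\n";
```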

v1.2 - 2012/01/21
-----------------
- Fixed a bug affecting the case sensitivity search
- New feature: an exception can be associated to a regular expression in the configuration file.
  The syntax is: "regex1 _EXCLUDE_ regex2". This could prevent some false positive matches.
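
The "regex1 _EXCLUDE_ regex2" syntax from v1.2 could be evaluated as in the
sketch below. It only illustrates the idea for the original line-based format
(the regex file moved to XML in v1.9). The sub name matchesWithExclude and the
sample rule are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split a "regex1 _EXCLUDE_ regex2" line and test a text.
# Returns 1 when regex1 matches and regex2 (if any) does not.
sub matchesWithExclude {
	my ($line, $content) = @_;
	my ($search, $exclude) = split(/\s+_EXCLUDE_\s+/, $line, 2);
	return 0 unless $content =~ /$search/i;
	return 0 if defined $exclude && $content =~ /$exclude/i;
	return 1;
}

my $rule = 'password _EXCLUDE_ lorem ipsum';
print matchesWithExclude($rule, "admin password: hunter2") ? "hit\n" : "skip\n";
print matchesWithExclude($rule, "lorem ipsum password") ? "hit\n" : "skip\n";
```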

v1.1 - 2012/01/20
-----------------
- Added a '--dump' configuration switch to save matching pasties in a directory.
  This is to keep the pasties posted with an expiration date (example: for later review)

v1.0 - 2012/01/18
-----------------
Initial release


================================================
FILE: pastemon.conf.sample
================================================
<!--
  pastemon.pl main configuration file sample
  Note: Features can be disabled by commenting them using standard comment tags.
//-->
<pastemon>
        <!-- Core features //-->
        <core>
                <ignore-case>yes</ignore-case>
                <pid-file>/var/run/pastemon.pid</pid-file>
                <regex-file>regex.conf</regex-file>
                <sample-size>256</sample-size>
                <proxy-config>proxies.conf</proxy-config>
		<ua-config>user-agents.conf</ua-config>
                <dump-directory>/home/pastemon/dump/%Y/%M/%D</dump-directory>
                <dump-all>yes</dump-all>
		<compress-pasties>yes</compress-pasties>
		<http-timeout>15</http-timeout>
		<!-- Use Jaro-Winkler distance algorithm //-->
		<distance-min>0.95</distance-min>
		<distance-max-size>10240</distance-max-size>
        </core>

        <!-- Websites to monitor //-->
        <websites>
                <pastebin>yes</pastebin>
		<pastebin-delay>10</pastebin-delay>
                <pastie>yes</pastie>
		<pastie-delay>120</pastie-delay>
		<nopaste>yes</nopaste>
		<nopaste-delay>300</nopaste-delay>
		<pastesite>yes</pastesite>
		<pastesite-delay>300</pastesite-delay>
        </websites>

        <!-- Follow URLs //-->
        <urls>
                <follow>yes</follow>
                <matching>(anonpaste|pastebin\.com|pastie\.org|pastehtml\.com|pastebay\.net|pastee\.org)</matching>
        </urls>

        <!-- CEF Output (ArcSight) //-->
        <cef-output>
                <destination>10.0.0.1</destination>
                <port>514</port>
                <severity>3</severity>
        </cef-output>

        <!-- Syslog Output //-->
        <syslog-output>
                <facility>daemon</facility>
        </syslog-output>

        <!-- Email Output //-->
        <smtp-output>
                <smtp-server>127.0.0.1</smtp-server>
                <from>pastemon@rootshell.be</from>
                <recipient>recipient@domain.com</recipient>
                <subject>PasteMon Alert</subject>
        </smtp-output>

        <!-- Wordpress Output (XMLRPC) //-->
        <wordpress-output>
                <site>www.myblog.com</site>
                <user>editor</user>
                <password>averystrongpassword</password>
                <category>favorite</category>
        </wordpress-output>

	<!-- SQLite Support //-->
	<db-output>
		<db-file>/home/pastemon/pastemon.db</db-file>
	</db-output>
</pastemon>


================================================
FILE: pastemon.pl
================================================
#!/usr/bin/perl
#
# pastemon.pl 
#
# This script runs in the background as a daemon and monitors pastebin.com for
# interesting content (based on regular expressions). Found information is sent
# to syslog
#
# This script is based on the Python script written by Xavier Garcia
# (http://www.shellguardians.com/2011/07/monitoring-pastebin-leaks.html)
#
# Copyright (c) 2012 Xavier Mertens
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. Neither the name of copyright holders nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
# TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL COPYRIGHT HOLDERS OR CONTRIBUTORS
# BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
#
# History
# -------
# See README file

use strict;
use threads;
use threads::shared;
use Digest::MD5 qw(md5 md5_hex md5_base64);
use File::Path; 
use Getopt::Long;
use IO::Socket;
use LWP::UserAgent;
use HTML::Entities;
use Sys::Syslog;
use Encode;
use XML::XPath;
use XML::XPath::XMLParser;
use Net::SMTP;
use POSIX qw(setsid);

# Optional modules
my $haveWordPressXMLRMC 	= eval "use WordPress::XMLRPC; 1";
my $haveTextJaroWinkler 	= eval "use Text::JaroWinkler qw(strcmp95); 1";
my $haveIOCompressGzip		= eval "use IO::Compress::Gzip; 1";
my $haveIOUncompressGunzip	= eval "use IO::Uncompress::Gunzip; 1";
my $haveDBI			= eval "use DBI; 1";

use constant PROCESS_URL	=> 1;
use constant PASTEBIN 		=> 0;	# Supported websites
use constant PASTIE 		=> 1;
use constant NOPASTE		=> 2;
use constant PASTESITE		=> 3;
my @webSiteNames = ( 			# Self-defined names for multiple usages
		"pastebin.com",
		"pastie.net",
		"nopaste.me",
		"pastesite.com",
	);

my $program = "pastemon.pl";
my $version = "v1.14";
my $debug;
my $help;
my $ignoreCase;		# By default respect case in strings search
my $cefDestination;	# Send CEF events to this destination:port
my $cefPort = 514;
my $cefSeverity = 3;
my $caught = 0;
my $httpTimeout = 10;	# Default HTTP timeout
my @pasties;
my @seenPasties;
my $maxPasties = 1000;	# TODO: Make it configurable?
my @regexList;		# List of interesting regex (with the data)
my $pidFile 	= "/var/run/pastemon.pid";
my $configFile	= "/etc/pastemon.conf";		# Main XML configuration file
my $regexFile;		# Regular expressions definitions
my $wpConfigFile;
my $proxyFile;
my @proxies;

my $uaFile;
my @uas;

my $wpSite;		# Wordpress settings
my $wpUser;
my $wpPass;
my $wpCategory;

my $smtpServer;		# SMTP settings
my $smtpFrom;
my $smtpRecipient;
my $smtpSubject;
my @smtpRecipients;

my $distanceMin;
my $distanceMaxSize;

my $followUrls;		# Follow URLs found in pastie
my $followMatching;

my $checkPastebin;	# Websites to monitor
my $checkPastie;
my $checkNopaste;
my $checkPastesite;

my $delayPastebin	= 300;	# Delays between pasties fetches
my $delayPastie		= 300;
my $delayNopaste	= 300;
my $delayPastesite	= 300;

my $syslogFacility = "daemon";
my $dumpDir;
my $dumpAll;
my $compressDump;
my $sampleSize;
my %matches;

my $dbFile;		# SQLite3 DB file

# Process arguments
my $result = GetOptions(
	"debug"			=> \$debug,
	"help"			=> \$help,
	"config=s"		=> \$configFile,
);

# TODO: Add a "--drop-sql-table" option to rebuild a fresh DB?
if ($help) {
	print <<__HELP__;
Usage: $0 --config=filepath [--debug] [--help]
Where:
--config : Specify the XML configuration file
--debug  : Enable debug mode (verbose - do not detach)
--help   : What you're reading now.
__HELP__
	exit 0;
}

parseXMLConfigFile($configFile);

($debug) && print STDERR "+++ Running in foreground.\n";

($cefDestination) && syslogOutput("Sending CEF events to $cefDestination:$cefPort (severity $cefSeverity)");

# Do not allow multiple running instances!
if (-r $pidFile) {
	open(PIDH, "<$pidFile") || die "Cannot read pid file!";
	my $currentpid = <PIDH>;
	close(PIDH);
	die "$program already running (PID $currentpid)";
}

loadRegexFromFile($regexFile) || die "Cannot load regex from file $regexFile";
loadUserAgentFromFile($uaFile) || die "Cannot load user-agents from file $uaFile";

if (!$debug) {
	my $pid = fork;
	die "Cannot fork" unless defined($pid);
	exit(0) if $pid;

	# We are the child
	(POSIX::setsid != -1) or die "setsid failed";
	chdir("/") || die "Cannot change working directory to /";
	close(STDOUT);
	close(STDERR);
	close(STDIN);
}

syslogOutput("Running with PID $$");
open(PIDH, ">$pidFile") || die "Cannot write PID file $pidFile: $!";
print PIDH "$$";
close(PIDH);

# Notify if HTTP proxy settings detected
if ($ENV{'HTTP_PROXY'}) {
	($proxyFile) && die "The HTTP_PROXY environment variable conflicts with the use of a proxies list";
	syslogOutput("Using detected HTTP proxy: " . $ENV{'HTTP_PROXY'});
}

my @threads;
my @webSites;
($checkPastebin) && push(@webSites, PASTEBIN);
($checkPastie) && push(@webSites, PASTIE);
($checkNopaste) && push(@webSites, NOPASTE);
($checkPastesite) && push(@webSites, PASTESITE);

# Launch threads based on the number of websites to monitor
for my $webSite (@webSites) {
	my $t = threads->new(\&mainLoop, $webSite);
	push(@threads, $t);
}

$SIG{'TERM'}	= \&sigHandler;
$SIG{'INT'}	= \&sigHandler;
$SIG{'KILL'}	= \&sigHandler;
$SIG{'USR1'}	= sub {
			foreach my $t (@threads) {
				$t->kill('SIGUSR1');
			}
		      };

# Parent process just waiting for a signal
while(1) {
	sleep(1);
	if ($caught) {
		syslogOutput("Killing my threads");
		foreach my $t (@threads) {
			$t->kill('SIGKILL');
		}
	}
}

exit 0;

# ---------
# Main loop
# ---------
sub mainLoop {
	$SIG{'USR1'}	= \&sigReload; # Handle config reload
	$SIG{'KILL'}    = \&sigHandler;
	my $webSite = shift;
	while(1) {
		my $pastie;
		if (!&fetchLastPasties($webSite)) {
			foreach $pastie (@pasties) {
				exit 0 if ($caught == 1);
				analyzePastie($webSite, $pastie, PROCESS_URL);
			}
			exit 0 if ($caught == 1);
		}
		purgeOldPasties($maxPasties);

		# Wait some seconds (depending on the website)
		DELAY: {
			$webSite == PASTEBIN	&& do { 
							($debug) && print STDERR "Sleeping $delayPastebin\n"; 
							sleep($delayPastebin); last DELAY; };
			$webSite == PASTIE	&& do {
							($debug) && print STDERR "Sleeping $delayPastie\n";
							sleep($delayPastie); last DELAY; };
			$webSite == NOPASTE	&& do {
							($debug) && print STDERR "Sleeping $delayNopaste\n";
							sleep($delayNopaste); last DELAY; };
			$webSite == PASTESITE	&& do {
							($debug) && print STDERR "Sleeping $delayPastesite\n";
							sleep($delayPastesite); last DELAY; };
		}
	}
}

#
# analyzePastie
#
sub analyzePastie {
	my $webSite = shift;
	my $pastie = shift or return;
	my $processUrl = shift;
	my $regex;
	my $md5;
	if (!grep /\Q$pastie\E/, @seenPasties) {	# \Q..\E: URLs contain regex metacharacters ('?', '.')
		my $content = fetchPastie($pastie);
		if ($content) {
			# If we receive a "slow down" message, follow Pastebin recommendation!
			if ($content =~ /Please slow down/) {
				($debug) &&  print STDERR "+++ Slow down message received. Paused 5 seconds\n";
				sleep(5);
			}
			else {
				# Compute the MD5 digest
				$md5 = md5_hex(encode('UTF8',$content));
				if (!dbSearchMD5($md5)) { 
					undef(%matches);	# Reset the matches regex/counters
					my $i = 0;
					my $regexSearch;
					my $regexInclude;
					my $regexExclude;
					my $regexDesc;
					my $regexCount;
					foreach $regex (@regexList) {
						$regexSearch	= @$regex[0];
						$regexInclude	= @$regex[1];
						$regexExclude	= @$regex[2];
						$regexDesc	= @$regex[3];
						$regexCount	= @$regex[4];
						my $sampleData;
						my ($startPos, $endPos);
						my $preCount = 0;
						if ($ignoreCase) {
							$preCount += () = $content =~ /$regexSearch/mgi;
							$startPos = $-[0];
							$endPos = $+[0];
						}
						else {
							$preCount += () = $content =~ /$regexSearch/mg;
							$startPos = $-[0];
							$endPos = $+[0];
						}
						if ($preCount >= $regexCount) {	
							if ($sampleSize) {
								# Optional: extract a sample of the data
								$startPos = (($startPos - $sampleSize) < 0) ? 0 : ($startPos - $sampleSize);
								$sampleData = encode('UTF8', substr($content, $startPos, ($endPos - $startPos) + $sampleSize));
							}
							# Process "include" regex defined
							if ($regexInclude ne "") {
								my $postCount = 0;
								if ($ignoreCase) {
									$postCount += () = $content =~ /$regexInclude/mgi;
								} else {
									$postCount += () = $content =~ /$regexInclude/mg;
								}
								if ($postCount) {
									# Matches for include $regex
									$matches{$i} = [ ( $regexSearch, $preCount, $sampleData ) ];
									$i++;
								}
							}
							elsif ($regexExclude ne "") {
								my $postCount = 0;
								if ($ignoreCase) {
									$postCount += () = $content =~ /$regexExclude/mgi;
								} else {
									$postCount += () = $content =~ /$regexExclude/mg;
								}
								if (! $postCount) {
									# Matches for exclude $regex
									$matches{$i} = [ ( $regexSearch, $preCount, $sampleData ) ];
									$i++;
								}
							}
							else {
								$matches{$i} = [ ( $regexSearch, $preCount, $sampleData ) ];
								$i++;
							}
						}
					}
					if ($followUrls && $processUrl) {
						$i += processUrls($content);
					}
					if ($i) {
						# Try to find a corresponding pastie?
						if (!FuzzyMatch($webSite, $content))
						{
							# Generate the results based on matches
							my $buffer = "Found in " . $pastie . " : ";
							my $key;
							for $key (keys %matches) {
								$buffer = $buffer . $matches{$key}[0] . " (" . $matches{$key}[1] . " times) ";
							}
							if ($sampleSize) {
								# Optional: Add sample of data
								my $safeData = $matches{0}[2];
								# Sanitize the data
								$safeData =~ s/\r//g;		# Strip carriage returns
								$safeData =~ s/\n/\\n/g;	# Escape newlines
								$safeData =~ s/\t/\\t/g;
								$buffer = $buffer . "| Sample: " . $safeData;
							}
							syslogOutput($buffer);
	
							# Generating CEF event (if configured)
							($cefDestination) && sendCEFEvent($pastie);
		
							# Generating blog post (if configured)
							($wpSite) && createBlogPost($pastie);
	
							# Send SMTP notification (if configured)
							if ($smtpServer) {
								my $smtp = Net::SMTP->new($smtpServer) or die "Cannot create SMTP connection to $smtpServer: $@";
								$smtp->mail($smtpFrom);
								$smtp->recipient(@smtpRecipients, { SkipBad => 1});
								$smtp->data();
								my $subjectTags = "";
								for $key (keys %matches) {
									my $tempDesc = getRegexDesc($matches{$key}[0]);
									if (length($tempDesc) > 0) {
										$subjectTags = $subjectTags . '(' . $tempDesc . ') ';
									}
								}
								my $smtpBody = "To: $smtpRecipient\nSubject: $smtpSubject $subjectTags\n\n";
								for $key (keys %matches) {
									$smtpBody = $smtpBody . "Matched: " . $matches{$key}[0] . " (" . $matches{$key}[1] . " time(s))\n";
								}
								$smtpBody = $smtpBody . "\nSource: " . $pastie . "\n\n" . $content;
								$smtp->datasend($smtpBody);
								$smtp->dataend();
								$smtp->quit();
							}
	
							# Save pastie content in the dump directory (if configured)
							if ($dumpDir) {
								my $tempPastie = getPastieID($pastie);
								my $tempDir = validateDumpDir($webSite, $dumpDir); # Generate and create dump directory
								(-d $tempDir) or die "Cannot validate directory $dumpDir: $!";
								open(DUMP, ">:encoding(UTF-8)", "$tempDir/$tempPastie.raw") or die "Cannot write to $tempDir/$tempPastie.raw : $!";
								for $key (keys %matches) {
									print DUMP "Matched: " . $matches{$key}[0] . " (" . $matches{$key}[1] . " time(s))\n";
								}
								print DUMP "\n$content";
								close(DUMP);
								if ($compressDump) { # Compress pastie
									my $in  = "$tempDir/$tempPastie.raw";
									my $out = "$tempDir/$tempPastie.gz";
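									# Fully-qualified call: IO::Compress::Gzip is an optional module loaded at run time
									if (IO::Compress::Gzip::gzip($in => $out)) {
										unlink("$tempDir/$tempPastie.raw");
									}
									else {
										syslogOutput("Cannot compress $tempDir/$tempPastie.raw: $IO::Compress::Gzip::GzipError");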
									}
								}
							}
	
						}
					}
					elsif ($dumpAll && $dumpDir) {
						# Mirroring mode - dump the pastie in all cases
						my $tempPastie = getPastieID($pastie);
						my $tempDir = validateDumpDir($webSite, $dumpDir);
						(-d $tempDir) or die "Cannot validate directory $tempDir: $!";
						open(DUMP, ">:encoding(UTF-8)", "$tempDir/$tempPastie.raw") or die "Cannot write to $tempDir/$tempPastie.raw : $!";
						print DUMP "\n$content";
						close(DUMP);
						if ($compressDump) { # Compress pastie
							my $in  = "$tempDir/$tempPastie.raw";
							my $out = "$tempDir/$tempPastie.gz";
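							# Fully-qualified call: IO::Compress::Gzip is an optional module loaded at run time
							if (IO::Compress::Gzip::gzip($in => $out)) {
								unlink("$tempDir/$tempPastie.raw");
							}
							else {
								syslogOutput("Cannot compress $tempDir/$tempPastie.raw: $IO::Compress::Gzip::GzipError");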
							}
						}
					}
	
					# Flag this pastie as "seen"
					push(@seenPasties, $pastie);
	
					# Save pastie data in SQLite
					if ($dbFile) {
						dbSavePastie($pastie, $md5);
					}

					# Wait a random number of seconds to not mess with pastebin.com webmasters
					sleep(int(rand(5)));
				}
				else { # MD5 Exists in DB
					($debug) && print "DEBUG: MD5 $md5 already found in DB!\n";
				}
			}
		}
	}
}

#
# Search for interesting data in URLs found inside the pastie
#
sub processUrls {
	my $pastie = shift || return 0;
	while ($pastie =~ m,(http.*?://([^\s)\"](?!ttp:))+),g) { # "
		my $url = $&;
		if ($url =~ /$followMatching/gi) { #Process only URLs matching our regex!
                	($debug) && print "+++ Following URL: $url\n";
			my $ua = LWP::UserAgent->new;
			$ua->agent(getRandomUA());
			my $r = $ua->head("$url");
			if ($r->is_success && substr($r->header('Content-Type'), 0, 5) eq "text/") {	# Only process "text"
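				# FIXME: analyzePastie() expects ($webSite, $pastie, $processUrl). Called with a single
				# argument, $pastie is undefined and the sub returns immediately.
				analyzePastie($url);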
			}
        	}
		# Protect us against pastebin.com blacklist?
		#sleep(int(rand(15)));
	}
	return 0;
}

# 
# parseXMLConfigFile
# Load the configuration from provided XML file
# Args:
# $configFile = Main pastemon.conf XML file
#
sub parseXMLConfigFile {
	my $configFile = shift;
	(-r $configFile) || die "Cannot load XML file $configFile: $!";

	($debug) && print STDERR "+++ Loading XML file $configFile.\n";
	my $xml = XML::XPath->new(filename => "$configFile");
	my $buff;

	# Reset settings
	undef $pidFile;
	undef $sampleSize;
	undef $dumpDir;
	undef $dumpAll;
	undef $compressDump;
	undef $proxyFile;
	undef $uaFile;
	undef $cefDestination;
	undef $cefPort;
	undef $cefSeverity;
	undef $smtpServer;
	undef $smtpFrom;
	undef $smtpRecipient;
	undef $smtpSubject;
	undef $wpSite;
	undef $wpUser;
	undef $wpPass;
	undef $wpCategory;
	undef $distanceMin;
	undef $distanceMaxSize;
	undef $checkPastebin;
	undef $checkPastie;
	undef $checkNopaste;
        undef $checkPastesite;
	undef $followUrls;
	undef $followMatching;
	undef $dbFile;

	# Core Parameters
	my $nodes = $xml->find('/pastemon/core');
	foreach my $node ($nodes->get_nodelist) {
		$buff			= $node->find('ignore-case')->string_value;
		if (lc($buff) eq "yes" || $buff eq "1") {
			$ignoreCase++;
			($debug) && print STDERR "+++ Case-insensitive search enabled.\n";
		}
		$buff			= $node->find('dump-all')->string_value;
		if (lc($buff) eq "yes" || $buff eq "1") {
			$dumpAll++;
			($debug) && print STDERR "+++ Dumping all pasties (mirror mode).\n";
		}
		$buff			= $node->find('compress-pasties')->string_value;
		if (lc($buff) eq "yes" || $buff eq "1") {
			$compressDump++;
			($debug) && print STDERR "+++ Compressing dumped pasties.\n";
		}
		$pidFile		= $node->find('pid-file')->string_value;
		$regexFile		= $node->find('regex-file')->string_value;
		$sampleSize		= $node->find('sample-size')->string_value;
		$dumpDir		= $node->find('dump-directory')->string_value;
		$proxyFile		= $node->find('proxy-config')->string_value;
		$uaFile			= $node->find('ua-config')->string_value;
		$buff			= $node->find('http-timeout')->string_value;
		$httpTimeout		= $buff if ($buff ne "");	# Keep the default when not configured
		$distanceMin		= $node->find('distance-min')->string_value;
		$distanceMaxSize	= $node->find('distance-max-size')->string_value;
	}

	# Monitored websites
	my $nodes = $xml->find('/pastemon/websites');
	foreach my $node ($nodes->get_nodelist) {
		$buff			= $node->find('pastebin')->string_value;
		if (lc($buff) eq "yes" || $buff eq "1") {
			$checkPastebin++;
			($debug) && print STDERR "+++ pastebin.com monitoring activated.\n";
		}
		$buff                   = $node->find('pastie')->string_value;
		if (lc($buff) eq "yes" || $buff eq "1") {
			$checkPastie++;
			($debug) && print STDERR "+++ pastie.org monitoring activated.\n";
		}
		$buff                   = $node->find('nopaste')->string_value;
		if (lc($buff) eq "yes" || $buff eq "1") {
			$checkNopaste++;
			($debug) && print STDERR "+++ nopaste.me monitoring activated.\n";
		}
		$buff			= $node->find('pastesite')->string_value;
		if (lc($buff) eq "yes" || $buff eq "1") {
			$checkPastesite++;
			($debug) && print STDERR "+++ pastesite.com monitoring activated.\n";
		}
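		# Keep the built-in default when a delay is missing or not numeric
		($buff = $node->find('pastebin-delay')->string_value)	=~ /^\d+$/ and $delayPastebin	= $buff;
		($buff = $node->find('pastie-delay')->string_value)	=~ /^\d+$/ and $delayPastie	= $buff;
		($buff = $node->find('nopaste-delay')->string_value)	=~ /^\d+$/ and $delayNopaste	= $buff;
		($buff = $node->find('pastesite-delay')->string_value)	=~ /^\d+$/ and $delayPastesite	= $buff;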
	}

	# Follow URLs
	my $nodes = $xml->find('/pastemon/urls');
	foreach my $node ($nodes->get_nodelist) {
		$buff			= $node->find('follow')->string_value;
		if (lc($buff) eq "yes" || $buff eq "1") {
			$followUrls++;
			($debug) && print STDERR "+++ Follow URLs feature activated.\n";
		}
		$followMatching		= $node->find('matching')->string_value;
	}

	# CEF Parameters
	my $nodes = $xml->find('/pastemon/cef-output');
	foreach my $node ($nodes->get_nodelist) {
		$cefDestination		= $node->find('destination')->string_value;
		$cefPort		= $node->find('port')->string_value;
		$cefSeverity		= $node->find('severity')->string_value;
	}

	# Syslog Parameters
	my $nodes = $xml->find('/pastemon/syslog-output');
	foreach my $node ($nodes->get_nodelist) {
		$syslogFacility		= $node->find('facility')->string_value;
	}

	# Wordpress Parameters
	my $nodes = $xml->find('/pastemon/wordpress-output');
	foreach my $node ($nodes->get_nodelist) {
		$wpSite			= $node->find('site')->string_value;
		$wpUser			= $node->find('user')->string_value;
		$wpPass			= $node->find('password')->string_value;
		$wpCategory		= $node->find('category')->string_value;
	}

	# SMTP Parameters
	my $nodes = $xml->find('/pastemon/smtp-output');
	foreach my $node ($nodes->get_nodelist) {
		$smtpServer		= $node->find('smtp-server')->string_value;
		$smtpFrom		= $node->find('from')->string_value;
		$smtpRecipient		= $node->find('recipient')->string_value;
		$smtpSubject		= $node->find('subject')->string_value;
	}

	# SQLite3 Parameters
	my $nodes = $xml->find('/pastemon/db-output');
	foreach my $node ($nodes->get_nodelist) {
		$dbFile			= $node->find('db-file')->string_value;
	}

	# ---------------------
	# Parameters validation
	# ---------------------

	# Check if the provided dump directory is writable to us
	if ($dumpDir) {
		# (-w $dumpDir) or die "Directory $dumpDir is not writable: $!";
		syslogOutput("Using $dumpDir as dump directory");
	}

	# Compress dumped pasties?
	if ($compressDump) {
		if ($haveIOCompressGzip) { # Module IO::Compress::Gzip installed?
			if (!$dumpDir) {
				syslogOutput("Option compress-pasties disabled: No dump directory defined");
				undef $compressDump;
			}
			if (!$haveIOUncompressGunzip) { # Module IO::Uncompress::Gunzip installed?
				syslogOutput("Option compress-pasties disabled: IO::Uncompress::Gunzip not installed");
				undef $compressDump;
			}
		}
		else {
			syslogOutput("Option compress-pasties disabled: IO::Compress::Gzip not installed");
			undef $compressDump;
		}
	}

	# Dumping all pasties requires a dump directory
	if ($dumpAll && !$dumpDir) {
		syslogOutput("No dump directory specified");
	}

	# Verify sampleSize format if specified
	if ($sampleSize) {
		die "Sample buffer length must be an integer!" if not $sampleSize =~ /^\d+$/;
		syslogOutput("Dumping $sampleSize bytes samples");
	}

	# Verify the HTTP timeout if specified
	if ($httpTimeout) {
		die "HTTP timeout must be an integer!" if not $httpTimeout =~ /^\d+$/;
		syslogOutput("HTTP timeout: $httpTimeout seconds");
	}

	# Verify Wordpress config
	if ($wpSite) {
		if ($haveWordPressXMLRMC) { # Module WordPress::XMLRPC installed?
			(!$wpSite || !$wpUser || !$wpPass || !$wpCategory) && die "Incomplete Wordpress configuration";
			($sampleSize) || die "A sample buffer length must be given with Wordpress output";
			syslogOutput("Dumping data to $wpSite/xmlrpc.php");
		} 
		else {
			syslogOutput("Wordpress configuration disabled: WordPress::XMLRPC not installed");
			undef $wpSite;
		}
	}

	# Verify SMTP config
	if ($smtpServer) {
		(!$smtpServer || !$smtpFrom || !$smtpRecipient || !$smtpSubject) && die "Incomplete SMTP configuration";
		my $smtp = Net::SMTP->new($smtpServer) or die "Cannot use SMTP server $smtpServer: $@";
		$smtp->quit();
		@smtpRecipients = split(/[, ]+/, $smtpRecipient);
		syslogOutput("Sending SMTP notifications to <".$smtpRecipient.">");
	}

	# Load proxies
	if ($proxyFile) {
		(-r $proxyFile) or die "Cannot read proxy configuration file $proxyFile: $!";
		loadProxyFromFile($proxyFile) || die "Cannot load proxies from file $proxyFile";
	}

	# Distance
	if ($distanceMin) {
		if ($haveTextJaroWinkler) { # Module Text::JaroWinkler installed?
			(!$dumpDir) && die "A dump directory must be configured to use the distance check";
			($distanceMin > 0 && $distanceMin < 1) or die "Minimum distance must be between 0 and 1";
			if ($distanceMaxSize) {
				die "Distance max size must be an integer!" if not $distanceMaxSize =~ /^\d+$/;
				syslogOutput("Enabled duplicate detection with distance of $distanceMin (size limit: $distanceMaxSize bytes)");
			} else {
				syslogOutput("Enabled duplicate detection with distance of $distanceMin");
			}
		}
		else {
			syslogOutput("Distance configuration disabled: Text::JaroWinkler not installed");
			undef $distanceMin;
		}
	}

	# SQLite3 Output
	if ($dbFile) {
		if ($haveDBI) { # Module DBI installed?
			# Do we have to initialize the DB (first execution)
			my $dbh = DBI->connect("dbi:SQLite:dbname=" . $dbFile)
				or die "Cannot connect to the SQLite DB " . $dbFile . "\n";
			my $sth = $dbh->prepare("SELECT name FROM sqlite_master WHERE type='table' AND name='pasties'");
			$sth->execute();
			my $data = $sth->fetch();
			if (!$data) {	# Table 'pasties' does not exist. Create it.
				$sth = $dbh->prepare("CREATE TABLE pasties (id VARCHAR(50),
										timestamp DATETIME,
										url VARCHAR(128),
										matched VARCHAR(256),
										path VARCHAR(256),
										md5 VARCHAR(32) PRIMARY KEY,
										type INTEGER)");
				$sth->execute() or die "Cannot create table 'pasties'";
				$sth = $dbh->prepare("CREATE UNIQUE INDEX pasties_idx ON pasties(id)");
				$sth->execute() or die "Cannot create index 'pasties_idx'";
				($debug) && print STDERR "+++ Created database " . $dbFile . "\n";
			}
			$dbh->disconnect();
		}
		else {
			syslogOutput("DB support disabled: DBI not installed");
			undef $dbFile;
		}
	}

	# Follow URL
	if ($followUrls && !$followMatching) {
		syslogOutput("Warning: No regex defined to match URLs");
		$followMatching = ".*";	# Match everything
	}

	return;
}

#
# Download the latest pasties and load them in a Perl array
# (http://pastebin.com/archive)
#
sub fetchLastPasties {
	my $webSite = shift;
	my $tempProxy;
	my $ua = LWP::UserAgent->new;
	$ua->timeout($httpTimeout);
	if (@proxies) {
		$tempProxy = selectRandomProxy();
		$ua->proxy('http', $tempProxy);
	}
	else {
		($ENV{'HTTP_PROXY'}) && $ua->env_proxy;
	}
	$ua->agent(getRandomUA());

	undef @pasties;	# Reset the array first!

	# www.pastebin.com
	if ($webSite == PASTEBIN) {
		($debug) && print STDERR "Loading new pasties from pastebin.com.\n";
		my $response = $ua->get("http://pastebin.com/archive");
		if ($response->is_success) {
			# Load the pasties into an array
			# @pasties = $response->decoded_content =~ /<td class=\"icon\"><a href=\"\/(\w+)\">.+<\/a><\/td>/g;
			# New format (2012/02/19):
			my @tempPasties = $response->decoded_content =~ /<a href=\"\/(\w{8})\">.+<\/a><\/td>/g;
			# Append the complete URL
			foreach my $p (@tempPasties) {
				$p = 'http://pastebin.com/raw.php?i=' . $p;
			}
			push(@pasties, @tempPasties);
		}
		else {
			syslogOutput("Cannot fetch www.pastebin.com: " . $response->status_line);
			# If cannot fetch pastie and we use proxies, disable the current one!
			(@proxies) && disableProxy($tempProxy);
			return 1;
		}
	}
	elsif ($webSite == PASTIE) {
		#($debug) && print STDERR "Loading new pasties from pastie.org.\n";
		my $response = $ua->get("http://pastie.org/pastes");
		if ($response->is_success) {
			my @tempPasties = $response->decoded_content =~ /<a href=\"(http:\/\/pastie.org\/pastes\/\d{7})\">/g;
			# Append the complete URL
			foreach my $p (@tempPasties) {
				$p = $p . '/download';
			}
			push(@pasties, @tempPasties);
		}
		else {
			syslogOutput("Cannot fetch www.pastie.org: " . $response->status_line);
			# If cannot fetch pastie and we use proxies, disable the current one!
			(@proxies) && disableProxy($tempProxy);
			return 1;
		}
	}
	elsif ($webSite == NOPASTE) {
		#($debug) && print STDERR "Loading new pasties from nopaste.me.\n";
		my $response = $ua->get("http://nopaste.me/recent");
		if ($response->is_success) {
			my @tempPasties = $response->decoded_content =~ /<a href=\"http:\/\/nopaste.me\/paste\/([a-z0-9]+)\">/ig;
			# Append the complete URL
			foreach my $p (@tempPasties) {
				$p = 'http://nopaste.me/raw/' . $p . '.txt';
			}
			push(@pasties, @tempPasties);
		}
		else {
			syslogOutput("Cannot fetch nopaste.me: " . $response->status_line);
			# If cannot fetch pastie and we use proxies, disable the current one!
			(@proxies) && disableProxy($tempProxy);
			return 1;
		}
	}
	elsif ($webSite == PASTESITE) {
		($debug) && print STDERR "Loading new pasties from pastesite.com.\n";
		my $response = $ua->get("http://pastesite.com/recent");
		if ($response->is_success) {
			my @tempPasties = $response->decoded_content =~ /<a href=\"(\d+)\" title=\"View this Paste/ig;
			# Append the complete URL
			foreach my $p (@tempPasties) {
				$p = 'http://pastesite.com/' . $p;
			}
			push(@pasties, @tempPasties);
		}
		else {
			syslogOutput("Cannot fetch pastesite.com: " . $response->status_line);
			# If cannot fetch pastie and we use proxies, disable the current one!
			(@proxies) && disableProxy($tempProxy);
			return 1;
		}
	}
	else {
		die "Unknown website constant: $webSite";
	}

	# DEBUG
	#foreach my $p (@pasties) {
	#	print "DEBUG: $p\n";
	#}
	return 0;
}

#
# Fetch the raw content of a pastie and return its content
#
sub fetchPastie {
	my $tempProxy;
	my $pastie = shift;
	my $ua = LWP::UserAgent->new;
	$ua->timeout($httpTimeout);
	if (@proxies) {
		$tempProxy = selectRandomProxy();
		$ua->proxy('http', $tempProxy);
	}
	else {
		($ENV{'HTTP_PROXY'}) && $ua->env_proxy;
	}
	$ua->agent(getRandomUA());
	my $response = $ua->get("$pastie");
	if ($response->is_success) {
		# Hack for pastesite.com: Extract data from the <textarea> </textarea>
		# (To bypass the <continue> button)
		if ($pastie =~ /http:\/\/pastesite.com/) {
			# Non-greedy capture so that '>' characters inside the paste do not truncate it
			if ($response->decoded_content =~ /<textarea [^>]*>(.*?)<\/textarea>/is) {
				my $pastesiteContent = $1;
				return $pastesiteContent;
			}
		}
		else {
			return $response->decoded_content;
		}
	}
	($debug) &&  print STDERR "+++ Cannot fetch pastie $pastie: " . $response->status_line . "\n";

	# If cannot fetch pastie and we use proxies, disable the current one!
	(@proxies) && disableProxy($tempProxy);
	return "";
}

#
# Load the regular expressions from the configuration file to a Perl array
#
sub loadRegexFromFile {
	my $file = shift;
	die "A configuration file is required" unless defined($file);
	undef @regexList; # Clean up array (if reloaded via SIGUSR1)
	( -r "$file") || die "Cannot open file $file: $!";
	my $xp = XML::XPath->new( filename => "$file");
	my $ns = $xp->find('/config/regex');
	foreach my $n ($ns->get_nodelist) {
		my @r;
		push(@r,	$n->find('search')->string_value);
		push(@r,	$n->find('include')->string_value);
		push(@r,	$n->find('exclude')->string_value);
		push(@r,	$n->find('description')->string_value);
		if ($n->find('count')->string_value ne "") {
			push(@r,$n->find('count')->string_value);
		} else {
			push(@r, "1");
		}
		push(@regexList, [ @r ]);
	}
	syslogOutput("Loaded " . @regexList . " regular expressions from " . $file);
	return(1);
}

#
# Load proxies from the configuration file
#
sub loadProxyFromFile {
	my $file = shift;
	return(1) unless defined($file);
	open(PROXY_FD, "$file") || die "Cannot open file $file : $!";
	undef @proxies; # Reset the list so a reload (SIGUSR1) does not append duplicates
	while(<PROXY_FD>) {
		chomp;
		(length > 0) && push(@proxies, 'http://'.$_);
	}
	close(PROXY_FD);
	(@proxies) || die "No proxies read from $file";
	syslogOutput("Loaded " . @proxies . " proxies from " . $file);
	return(1);
}

#
# Return a random proxy from the loaded list
#
sub selectRandomProxy {
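	# rand(N) returns a value in [0, N): use the element count so the last proxy can be picked too
	my $randomIdx = int(rand(scalar(@proxies)));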
	# ($debug) && print STDERR "+++ Using proxy: " . $proxies[$randomIdx] . "\n";
	return $proxies[$randomIdx];
}

#
# Remove a faulty proxy from the proxies array
#
sub disableProxy {
	my $badProxy = shift;
	return unless defined($badProxy);
	# Look the proxy up by index and remove only that entry
	# (nothing is removed if the proxy is no longer in the list)
	for my $i (0 .. $#proxies) {
		next unless $proxies[$i] eq $badProxy;
		splice @proxies, $i, 1;
		syslogOutput("Disabled unreliable proxy " . $badProxy . " (" . @proxies . ' active proxies)');
		last;
	}
}

sub purgeOldPasties {
	my $max = shift;
	while (@seenPasties > $max) {
		#delete $seenPasties[0]; -- DEPRECATED
		splice @seenPasties, 0, 1;
	}	
	return;
}

#
# Handle a proper process cleanup when a signal is received
#
sub sigHandler {
	syslogOutput("Received signal. Exiting.");
	unlink($pidFile) if (-r $pidFile);
	$caught = 1;
}

#
# Reload configuration files
#
sub sigReload {
	syslogOutput("Reloading config files (Thread ID " . threads->tid() . ")");
	parseXMLConfigFile($configFile);
	loadRegexFromFile($regexFile);
	(@proxies) && loadProxyFromFile($proxyFile);
	return;
}

#
# Send Syslog message using the defined facility
#
sub syslogOutput {
	my $msg = shift or return(0);
	if ($debug) {
		print STDERR "+++ $msg\n";
	}
	else {
		openlog($program, 'pid', $syslogFacility);
		syslog('info', '%s', $msg);
		closelog();
	}
}

#
# Send a CEF syslog packet to an ArcSight device/application
#
sub sendCEFEvent {
	my $pastie = shift;
	# Syslog data format must be "Jul 10 10:11:23"
	my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
	my @months = ("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec");
	my $timeStamp = sprintf("%3s %2d %02d:%02d:%02d", $months[$mon], $mday, $hour, $min, $sec);
	my $buffer = sprintf("<%d>%s CEF:0|%s|%s|%s|regex-match|One or more regex matched|%d|request=%s destinationDnsDomain=pastebin.com msg=Interesting data has been found on pastebin.com. ", 
			29,
			$timeStamp,
			"blog.rootshell.be",
			$program,
			$version,
			$cefSeverity,
			$pastie
	);
	my $key;
	my $i = 1;
	for $key (keys %matches) {
		$buffer = $buffer . "cs" . $i . "=" . $matches{$key}[0] . " cs" . $i . "Label=Regex". $i . "Name cn" . $i . "=" . $matches{$key}[1]. " cn" . $i . "Label=Regex" . $i . "Count ";
		if (++$i > 6) {
			syslogOutput("Maximum 6 matching regex can be logged");
			last;
		}
	}

	# Ready to send the packet!
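	# IO::Socket::INET reports constructor errors in $@ (the message must be double-quoted to interpolate)
	my $sock = IO::Socket::INET->new(PeerAddr => $cefDestination,
					PeerPort => $cefPort,
					Proto => 'udp',
					Timeout => 1) or die "Could not create socket: $@";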
	$sock->send($buffer) or die "Send UDP packet error: $!";
}

#
# Return the description corresponding to a regex
#
sub getRegexDesc {
	my $regex = shift;
	return unless defined($regex);
	foreach my $r (@regexList) {
		if ($regex eq $r->[0]) {
			return($r->[3]);
		}
	}
	return;
}

#
# Insert a new pastie in SQLite DB 
#
sub dbSavePastie {
	my $pastie = shift or return;
	my $md5 = shift or return;
	my $id= getPastieID($pastie);
	my $dbh = DBI->connect("dbi:SQLite:dbname=" . $dbFile)
			or die "Cannot connect to the SQLite DB " . $dbFile . "\n";
	# Use bind parameters so that quotes in the URL or ID cannot break the statement
	my $sth = $dbh->prepare("INSERT INTO pasties VALUES(?, DATETIME('now'), ?, '', '', ?, 0)");
	if (!$sth->execute($id, $pastie, $md5)) {
		syslogOutput("Cannot insert MD5 " . $md5 . " in DB: " . $sth->errstr() . " (Pastie: " . $pastie . ")");
	}
	$dbh->disconnect();
	return;
}

#
# Search for a pastie MD5 in SQLite DB
#
sub dbSearchMD5 {
	my $md5 = shift or return;
	my $dbh = DBI->connect("dbi:SQLite:dbname=" . $dbFile)
			or die "Cannot connect to the SQLite DB " . $dbFile . "\n";
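	my $sth = $dbh->prepare("SELECT md5 FROM pasties WHERE md5 = ?");
	$sth->execute($md5);
	# rows() is not reliable for SELECT statements: test for a fetched row instead
	my $found = $sth->fetchrow_arrayref() ? 1 : 0;
	$sth->finish();
	$dbh->disconnect();
	return $found;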
}

#
# Create a Wordpress blog post
#
sub createBlogPost {
	my $pastie = shift;
	
	my $key;
	my $title;
	my $buffer = "";
	my $tags = "";

	# Generate tag based on the URL
	if ($pastie =~ /pastebin\.com/) {
		$tags = 'pastebin.com,';
	}
	elsif ($pastie =~ /pastie\.org/) {
		$tags = 'pastie.org,';
	}
	elsif ($pastie =~ /nopaste\.me/) {
		$tags = 'nopaste.me,';
	}
	elsif($pastie =~ /pastesite\.com/) {
		$tags = 'pastesite.com,';
	}

	for $key (keys %matches) {
		if (!$title) {
			$title = 'Potential leak of data: ' . getRegexDesc($matches{$key}[0]);
		}
		$buffer = $buffer . 'Detected ' . $matches{$key}[1] . ' occurrence(s) of \'' . $matches{$key}[0] . '\':<br>';
		$buffer = $buffer . '<pre>' . encode_entities($matches{$key}[2]) . '</pre><p>';
		# Populate Wordpress tags
		$tags = $tags . getRegexDesc($matches{$key}[0]) . ',';
	}
	$buffer = $buffer . 'Source: <a href="' . $pastie . '">' . $pastie . '</a><br>';
	# Prepare the XML request
	my $o = WordPress::XMLRPC->new;
	$o->username($wpUser);
	$o->password($wpPass);
	$o->proxy('http://' . $wpSite . '/xmlrpc.php');
	if (!$o->server()) {
		syslogOutput("Cannot connect to the Wordpress blog");
		return;
	}

	my $hashref = {
		'title' 		=> $title,
		'categories'		=> [ $wpCategory ],
		'description'		=> $buffer,
		'mt_keywords'		=> $tags,
		'mt_allow_comments'	=> 0,
	};
	# WordPress::XMLRPC does not handle exceptions properly.
	# Eval will catch runtime errors or die() and report the
	# error properly (into $@)
	my $ret = eval {
		my $ID = $o->newPost($hashref, 1);
	};
	if (!$ret) {
		syslogOutput("Cannot post Wordpress article: $@");
	}
	return;
}

#
# Perl trim function to remove whitespace from the start and end of the string
#
sub trim($) {
	my $string = shift;
	$string =~ s/^\s+//;
	$string =~ s/\s+$//;
	return $string;
}

#
# Compare a pastie to the already loaded ones using the Jaro Winkler algorithm
# See http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
#
sub FuzzyMatch {
	my $webSite = shift;
	my $newContent = shift;
	my $timeIn = time();

	# Is this feature enabled?
	(!$distanceMin) && return 0;

	# A dump directory must be configured!
	(!$newContent || !$dumpDir) && return 0;

	# Ignore content if size above configured limit (performance)
	# (the size limit is optional: skip the check when it is not configured)
	($distanceMaxSize && length($newContent) > $distanceMaxSize) && return 0;

	foreach my $pastie (@seenPasties) {
		my $tempPastie = getPastieID($pastie);
		my $tempDir = validateDumpDir($webSite, $dumpDir);
		my $buffer = "";
		# Do we have compression enabled?
		if ($compressDump) {
			# Uncompress on the fly
			my $in = "$tempDir/$tempPastie.gz";
			if (-r $in) {
				use IO::Uncompress::Gunzip qw(gunzip);
				gunzip $in => \$buffer or die "Cannot uncompress $tempDir/$tempPastie.gz";
				($debug) && print STDERR "+++ Uncompressed $in : " . length($buffer) . " bytes\n";
			}
		}
		else {
			# Read the plain text file
			if (open(FD, "$tempDir/$tempPastie.raw")) {
				$buffer = do { local $/; <FD> };
				close(FD);
			}
		}
		if (length($buffer) > 0) {
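			# Strip ALL leading "Matched: ..." lines (one per matching regex) and the
			# blank line that follows, keeping only the pastie content. This assumes
			# the dump starts with such a header; other files yield an empty buffer.
			$buffer = ($buffer =~ /^(?:Matched: [^\n]*\n)+\n(.*)/s) ? $1 : "";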
			if (length($buffer) > 0) { # Bug fix 2012/07/16: Only process "matched" pasties!
				my $distance = strcmp95($newContent, $buffer, length($newContent), TOUPPER => 1, HIGH_PROB => 0);
				if ($distance > $distanceMin) {
					syslogOutput("Potential duplicate content found with pastie $pastie (distance: $distance)");
					return 1;
				}
			}
		}
	}
	my $timeOut = time();
	$timeOut -= $timeIn;
	($debug) && print STDERR "+++ Time: " . $timeOut . "\n";
	return 0;
}

#
# Build the dump directory based on macro and create it
#
sub validateDumpDir {
	my $webSite = shift;
	my $dir = shift;
	(!$dir) && return "";

	# Replace macro-% by correct values. Supported:
	# %Y : Year
	# %M : Month
	# %D : Day
	# %H : Hour
	# %S : Site name
	my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
	$year+=1900;
	$mon  = sprintf("%02d", ++$mon);
	$mday = sprintf("%02d", $mday);
	$hour = sprintf("%02d", $hour);
	$dir =~ s/\%Y/$year/g;
	$dir =~ s/\%M/$mon/g;
	$dir =~ s/\%D/$mday/g;
	$dir =~ s/\%H/$hour/g;
	$dir =~ s/\%S/$webSiteNames[$webSite]/g;
	if (!(-d $dir)) {
		if (!mkpath("$dir")) {
			# If mkpath() failed, re-check the directory
			# (Could have been created by another thread!)
			(-d $dir) && return $dir;
			syslogOutput("mkdir(\"$dir\") failed: $!");
			return "";
		}
	}
	return $dir;
}

# 
# Extract the pastie ID from a URL:
# pastebin.com: pastebin.com/raw.php?i=(XXX)
# pastie.org: pastie.org/pastes/(XXX)/download
# pastesite.com: pastesite.com/(XXX)
# nopaste.me: nopaste.me/raw/(XXX)
#
sub getPastieID {
	my $pastie = shift or return "";
	if ($pastie =~ /pastebin\.com\/raw\.php\?i=(\w+)/) {
		return $1;
	}
	if ($pastie =~ /pastie\.org\/pastes\/(\d+)\/download/) {
		return $1;
	}
	if ($pastie =~ /pastesite\.com\/(\d+)/) {
		return $1;
	}
	if ($pastie =~ /nopaste\.me\/raw\/(\w+)/) {
		return $1;
	}
	return "";
}

#
# Load User-Agents from file
#
sub loadUserAgentFromFile {
	my $file = shift;
	return(1) unless defined($file);
	open(UA_FD, "$file") || die "Cannot open file $file : $!";
	while(<UA_FD>) {
		chomp;
		(length > 0) && push(@uas, $_);
	}
	close(UA_FD);
	(@uas) || die "No User-Agents read from $file";
	syslogOutput("Loaded " . @uas . " User-Agents from " . $file);
	return(1);
}

#
# Return a random User-Agent from the loaded list
#
sub getRandomUA {
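	# rand(N) returns a value in [0, N): use the element count so the last User-Agent can be picked too
	my $randomIdx = int(rand(scalar(@uas)));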
	return $uas[$randomIdx];
}

# Eof


================================================
FILE: proxies.conf
================================================
177.19.206.11:8080
189.113.64.122:8080
187.115.151.36:8080
91.183.109.156:8080
80.167.238.77:1080
62.243.224.180:1080
80.63.56.146:1080
151.1.196.15:80
78.46.212.216:3128
94.192.40.199:1055
82.165.35.26:6515
93.166.121.107:8118
213.197.81.53:3128
71.178.213.38:1387
94.199.183.50:80
173.79.236.8:1090
195.138.76.136:3128
208.58.31.35:1923
89.207.68.10:1080
96.252.98.13:1254
89.188.141.51:80
24.61.214.23:1585
68.60.164.3:1670
98.226.60.95:1743
24.228.50.55:1208
212.45.5.172:3128
38.105.180.252:3128
98.215.237.134:1695
216.155.139.115:3128
24.60.138.133:1941
69.127.120.81:1887
108.3.177.51:1739
67.164.115.117:1759
78.111.247.217:1080
88.247.125.227:8086
74.167.236.218:1909
82.222.49.81:8080
84.53.240.61:3128
109.108.78.5:3128
78.139.74.103:8080
80.63.56.146:8118
24.217.157.96:1631
88.250.208.152:8088
178.49.100.138:3128
195.150.129.230:3128
64.85.181.45:8080
188.231.0.15:8088
123.201.99.200:1080
109.111.191.53:3128
85.132.122.138:3128
92.255.194.185:80
67.168.186.161:1779
75.69.5.128:1281
82.222.49.65:8080
88.85.125.78:8080
82.222.19.54:8080
99.39.236.12:1080
95.56.234.92:3128
80.62.217.19:9100
94.158.111.63:808
188.226.102.246:8000
94.158.111.63:1080
194.0.229.54:9050
208.110.220.133:8080
183.82.96.106:8080
190.144.147.198:80
81.223.49.99:8080
66.90.146.51:1021
92.246.139.180:3128
184.72.48.52:8118
82.222.48.1:8080
219.90.112.55:1080
78.153.151.7:3128
93.167.245.178:9100
84.41.108.74:8080
190.66.17.53:3128
74.196.50.32:808
75.125.242.146:80
189.1.128.81:3128
121.192.32.221:1080
184.154.188.122:8118
95.170.219.89:8080
200.1.110.146:3128
41.209.15.235:1080
200.107.236.162:80
189.89.208.157:8080
190.147.162.13:3128
186.228.41.162:3128
85.25.151.125:3128
186.4.254.195:8080
84.41.105.1:8080
211.154.142.174:1080
212.23.70.186:3128
180.96.19.196:3128
118.102.230.186:1080
182.163.72.148:8080
183.181.174.169:80
211.76.97.150:80
217.64.29.73:3128
125.63.65.178:8080
219.235.110.13:1080
84.22.3.32:8080
123.50.56.206:80
65.209.244.80:8080
211.142.24.122:1080
221.231.114.147:8080
190.152.249.73:3129
200.150.66.226:3128
116.248.41.72:1080
180.96.19.25:8080
187.5.159.218:8080
222.238.156.249:8080
88.135.217.66:3128
190.151.111.202:8080
203.201.163.2:8000
211.76.97.147:80
211.239.84.130:443
109.251.143.22:8080
202.75.54.154:3128
222.255.27.150:31280
98.126.137.212:80
118.97.69.148:3128
222.238.156.245:80
109.251.34.19:8080
217.196.113.81:8080
69.134.37.246:8085
122.154.163.249:8888
222.238.156.249:80
2.184.192.15:8080
213.187.113.127:80
180.96.19.24:8080
190.3.106.166:80
118.97.164.75:8080
187.72.145.53:8080
196.25.36.180:8080
98.126.137.211:80
123.139.215.85:3128
202.51.120.230:80
223.27.145.171:3128
69.77.144.147:8087
211.76.97.148:80
112.65.219.72:80
58.67.147.194:8080
210.22.13.104:80
219.93.221.138:8080
59.172.208.186:8080
58.67.147.205:8080
118.97.32.12:3128
59.175.163.8:80
60.11.62.176:8088
211.86.157.95:3128
46.180.99.40:8080
82.131.174.21:8080
202.107.44.108:8080
119.246.39.171:808
177.36.242.17:8080
58.67.147.207:8080
193.253.191.26:8090
222.185.237.37:3128
183.179.146.153:8909
190.151.111.202:3129
61.167.49.188:8080
222.52.99.131:8081
122.72.30.119:80
58.67.147.206:8080
122.226.50.66:8080
210.19.8.218:8080
210.22.153.98:3128
177.36.242.29:8080
116.55.19.96:808
82.148.74.154:8080
118.97.208.194:3128
150.140.184.110:3128
124.172.250.177:3128
59.175.137.122:3128
203.114.105.243:8080
58.68.232.85:3128
202.117.35.249:80
222.177.13.25:3128
211.139.10.173:80
213.217.58.25:8080
58.67.147.196:8080
177.36.242.1:8080
58.67.147.201:8080
89.135.63.36:80
122.72.12.86:80
202.148.16.2:8000
85.131.163.219:3128
58.67.147.202:8080
177.36.242.5:8080
188.93.20.179:8080
118.174.131.94:3128
210.101.131.232:8080
95.31.214.137:3128
194.36.161.212:3128
218.25.249.185:80
210.86.239.131:3128
58.67.147.204:8080
58.67.147.200:8080
14.198.198.220:843
182.48.54.133:8080
122.72.12.87:80
221.212.196.27:8080
200.11.76.166:8118
202.203.132.26:3128
162.105.139.109:3128
98.103.7.148:3128
202.77.107.222:3128
222.169.224.206:80
177.36.242.9:8080
88.249.28.20:8088
121.96.247.25:3128
121.243.243.118:1080
177.36.242.21:8080
122.72.10.204:80
217.92.195.175:3128
210.101.131.231:8080
113.230.76.234:443
61.235.92.227:8909
211.76.97.145:80
111.118.179.232:3128
61.185.143.178:8080
122.72.12.90:80
122.193.31.69:3128
60.191.49.123:3128
220.113.15.21:1080
64.69.40.254:3128
177.36.242.13:8080
203.168.225.50:8909
211.152.36.99:80
218.203.107.167:80
208.73.211.108:80
41.89.211.5:80
60.209.5.13:808
178.48.15.181:8080
49.212.112.118:3128
116.38.254.26:18080
221.179.41.22:3128
178.17.80.109:8080
88.247.198.177:8088
122.72.12.88:80
58.177.172.231:8909
211.155.128.46:8080
60.213.44.50:3128
210.212.20.170:3128
211.76.97.152:80
212.192.120.67:3128
61.47.57.234:3128
72.51.36.221:8090
200.140.77.171:3128
50.22.88.80:3128
122.72.26.208:80
216.221.204.239:8085
210.51.23.136:8118
189.22.230.70:3128
113.16.185.35:8909
202.103.215.203:80
89.135.63.36:8040
123.88.42.187:8909
218.19.119.99:8080
123.234.215.230:81
193.2.191.211:80
58.67.147.208:8080
157.100.157.154:3128
113.230.76.234:80
66.146.193.31:8118
79.120.197.202:8080
117.79.235.90:80
202.111.188.117:808
219.145.93.110:8080
88.250.175.5:8088
111.1.33.138:80
201.161.45.165:8080
195.135.214.226:8080
118.97.94.19:8080
202.148.28.154:8000
122.255.120.246:8080
221.7.147.172:3128
95.110.227.178:81
200.68.18.178:80
173.224.120.54:3128
82.206.129.160:3128
212.182.64.86:3128
61.190.28.166:8080
81.223.49.110:8080
118.192.1.163:3128
190.82.89.155:3128
77.65.19.35:3128
163.26.71.123:8080
202.116.62.218:808
190.116.35.20:3128
123.124.158.227:8080
202.148.5.234:3128
121.14.9.76:80
222.124.214.60:8080
201.236.80.197:3128
61.166.155.230:8080
119.160.167.57:8118
59.90.219.85:8080
81.210.9.38:3128
189.19.251.225:80
122.72.12.89:80
81.223.49.109:8080
123.125.156.201:80
61.221.217.196:3128
202.137.18.40:80
211.94.93.224:3128
81.223.49.98:8080
61.147.67.61:3128
61.221.251.64:808
200.196.234.26:8080
81.223.49.107:8080
76.74.239.9:8090
213.140.116.188:9090
118.144.88.203:9999
221.7.228.137:80
205.251.132.51:8080
74.86.121.231:3128
119.160.177.2:8118
187.45.214.4:8080
118.174.0.155:3128
190.144.186.171:3128
190.82.89.156:3128
85.248.9.99:8080
187.53.149.22:8080
121.52.71.23:80
85.114.132.49:3128
194.150.220.113:3128
220.194.59.162:80
201.16.219.101:3128
200.181.109.20:80
31.25.137.202:8080
202.75.53.218:9090
74.86.121.229:3128
91.213.87.3:3129
190.152.249.73:3128
221.130.162.52:80
95.76.74.105:8000
74.86.121.228:3128
117.79.235.94:80
78.186.126.244:8086
117.239.2.85:6588
78.187.51.94:8086
59.125.82.23:8080
124.42.77.187:80
93.184.69.250:3128
61.134.121.237:80
41.75.201.146:80
116.248.41.72:808
58.254.134.201:8080
58.67.147.198:8080
120.71.46.225:8909
200.249.86.2:3128
187.60.96.7:3128
110.234.205.50:3128
41.73.2.35:8080
69.169.145.80:3128
222.77.69.210:3128
189.112.189.28:3128
58.67.147.203:8080
93.126.43.244:3128
187.72.145.54:8080
211.239.84.213:80
61.135.208.37:8081
208.43.216.138:3128
202.75.54.155:3128
203.167.31.190:80
64.34.197.103:8118
194.150.220.35:3128
200.54.92.187:3128
217.219.175.81:8080
187.52.188.127:80
195.206.38.53:3128
190.0.41.78:80
200.93.182.18:3128
187.72.83.110:8080
200.54.92.187:80
201.219.17.45:3128
164.151.129.37:80
200.113.15.74:3128
82.148.109.68:3128
190.121.20.131:3128
85.72.35.240:8080
41.0.65.71:8080
84.25.123.69:8080
201.65.32.131:80
186.42.197.178:3128
113.10.164.138:80
62.121.64.19:8080
190.0.45.98:8080
118.97.75.226:8080
174.129.210.43:27977
122.255.120.113:8080
202.53.164.30:8080
94.78.80.190:8080
182.71.75.30:3128
201.90.8.115:3128
186.201.27.66:3128
213.217.58.25:80
211.23.11.110:3128
41.234.202.225:8080
190.254.20.42:8080
190.85.84.122:8085
201.30.132.51:3128
196.2.73.246:3128
200.253.116.4:3128
186.3.41.17:3128
189.17.66.85:3128
200.52.220.45:3128
200.111.123.253:3128
196.216.56.18:3128
200.198.116.54:3128
125.163.208.166:8080
189.108.118.194:3128
217.219.67.148:3128
211.115.83.21:80
200.253.116.3:3128
196.205.71.54:8080
189.85.16.152:80
186.113.186.94:8080
217.218.251.189:3128
200.251.62.51:3128
118.97.18.250:3128
200.37.63.11:3128
119.235.54.42:3128
95.170.219.89:80
41.84.130.229:3128
186.4.233.2:80
122.72.0.6:80
200.61.168.141:8080
119.18.148.46:8080
122.152.183.180:80
122.72.12.85:80
202.39.208.69:80
220.168.248.106:80
189.75.171.226:3128
190.147.134.217:3128
187.0.222.167:3128
178.48.2.237:8080
60.251.189.134:3128
190.108.80.195:3128
186.194.7.185:8080
210.176.171.236:8080
122.117.43.13:808
60.191.232.230:80
190.96.64.234:8080
220.168.248.101:80
93.116.142.104:8080
93.114.61.245:8080
116.90.208.30:8080
200.37.200.71:8080
201.76.215.77:3128
187.72.86.146:3128
200.37.170.141:3128
210.176.171.237:8080
124.244.242.72:3128
190.98.248.114:80
190.144.250.172:3128
220.135.92.184:3128
82.148.109.68:3270
121.88.250.204:80
115.109.122.79:3128
190.249.167.176:8080
186.46.187.43:3128
61.183.254.88:808
201.73.83.130:3128
123.84.14.72:80
118.98.31.6:8080
200.7.201.243:8080
187.111.11.82:8080
190.95.246.3:3128
187.102.64.137:8080
200.171.181.130:3128
186.3.41.22:3127
187.6.87.218:3128
190.152.76.254:8080
202.88.225.155:80
218.29.54.105:80
118.98.31.6:3128
222.83.210.45:8080
183.63.204.131:8080
190.26.91.173:3128
210.212.55.194:3128
190.196.162.155:8080
222.42.45.51:3128
189.113.64.122:8080
200.195.155.114:8080
202.91.246.215:3128
187.63.15.61:3128
222.124.219.219:8080
95.82.26.58:80
222.88.95.66:80
58.177.198.69:3128
83.128.92.181:80
61.19.127.131:8080
211.110.204.67:80
101.44.1.107:80
123.196.125.45:80
186.96.252.185:8085
41.35.46.156:8080
200.27.114.228:8080
200.68.18.178:8080
195.158.101.184:3128
41.210.52.202:8080
200.142.107.26:3128
211.143.40.230:80
41.35.46.128:80
122.155.0.190:3128
122.72.30.120:80
116.90.208.30:443
200.223.17.204:8080
58.60.231.78:3128
218.189.26.158:8080
202.148.29.243:8000
218.204.97.86:3128
189.124.94.118:8080
200.137.130.31:3128
182.72.170.14:3128
146.83.193.75:3128
201.54.230.75:3128
118.96.31.91:3128
122.48.31.77:80
187.49.83.85:8080
187.0.181.166:80
58.83.224.217:8080
187.5.159.218:80
41.89.211.5:8080
113.108.181.171:3128
202.162.216.138:8080
217.219.28.50:80
207.219.7.136:80
92.61.182.200:80
27.50.21.45:8080
218.75.149.147:8080
177.39.212.109:3128
186.3.71.155:8080
122.207.3.66:80
201.57.146.136:3128
122.72.33.139:80
200.155.57.81:3128
195.208.249.40:3128
187.102.64.131:8080
88.12.58.163:3128
190.74.124.107:3128
187.72.136.140:3128
202.127.1.88:3128
118.97.8.106:8080
210.105.194.99:3128
200.114.84.2:8090
121.10.243.44:3128
189.91.231.88:3128
196.212.156.30:8080
178.219.243.114:3128
113.9.163.204:8080
187.111.223.10:8080
200.161.199.121:3128
221.5.71.188:3128
82.207.81.150:4866
223.27.145.172:3128
217.66.212.243:8080
122.72.28.19:80
113.53.252.131:3128
92.53.39.129:3128
177.52.17.162:3128
112.109.20.154:8888
201.209.205.147:3128
187.19.202.166:8080
202.46.127.241:8080
219.137.229.210:3128
222.184.9.242:3128
187.33.229.133:8080
189.2.17.21:3128
118.97.235.234:3128
125.88.75.151:3128
211.139.10.169:80
202.127.28.67:3128
117.34.7.21:3128
190.128.170.18:8080
222.124.217.170:8080
83.111.38.131:3128
217.219.28.50:3128
122.225.68.125:8181
218.210.199.252:3128
123.49.1.90:3128
218.93.118.154:3128
122.154.140.50:8080
118.97.209.218:8080
178.48.2.237:80
202.148.25.58:80
81.89.211.123:3128
222.124.5.82:8080
211.110.204.61:80
210.182.240.28:80
115.236.98.109:80
187.62.95.118:3128
189.36.139.234:9090
218.203.107.169:80
187.75.254.26:3128
202.143.168.140:3128
202.148.25.58:8080
190.103.220.36:8080
203.76.106.67:8080
202.159.223.52:3128
186.225.106.150:3128
203.114.112.101:3128
203.172.209.133:3128
78.39.56.18:3128
61.136.59.177:80
59.37.163.156:3128
92.63.15.226:8080
220.135.70.173:3128
118.96.152.155:8080
190.196.19.28:3128
190.121.22.186:8080
116.197.166.209:80
217.219.175.72:8080
122.160.154.254:3128
212.119.71.201:80
190.128.222.114:80
58.59.141.120:8081
187.33.251.104:3128
190.29.24.23:8080
182.99.127.29:80
218.14.227.197:3128
41.215.5.82:3128
123.196.114.70:80
202.179.82.28:3128
110.138.208.116:8008
61.54.26.44:8080
60.250.109.87:3128
85.185.166.163:8090
210.43.128.18:3128
122.225.107.27:80
122.72.33.138:80
111.94.140.111:3128
190.144.13.66:3128
85.97.190.122:8080
186.250.3.20:3128
202.91.234.152:3128
187.6.254.19:3128
201.252.251.195:8080
81.195.40.2:8080
117.102.101.219:8080
150.165.29.47:3128
202.137.18.66:3128
201.38.194.50:3128
203.89.25.186:8080
218.85.137.11:80
186.251.177.226:3128
202.143.146.205:8080
177.52.17.178:3128
180.139.91.27:8080
189.3.13.131:3128
203.113.102.61:8080
65.55.73.222:80
85.185.226.25:8080
210.9.41.116:80
200.251.62.50:3128
200.114.85.69:8090
60.250.20.107:3128
110.139.150.155:8080
122.225.19.181:3128
114.33.112.160:8181
118.97.237.109:8080
187.102.201.1:3128
200.140.77.170:3128
219.157.200.18:3128
116.90.208.187:8000
58.253.192.122:80
222.184.9.243:3128
203.66.188.252:8080
186.125.158.235:8080
202.137.27.170:3128
202.171.34.234:3124
200.42.69.94:8080
189.75.117.33:8080
122.225.108.110:3128
61.189.33.221:8080
113.53.252.106:8080
110.139.150.155:80
189.3.225.99:3128
203.189.89.153:8080
200.111.115.173:8080
60.190.136.90:3128
222.124.33.33:3128
118.97.75.189:3128
118.174.1.186:3128
222.124.33.33:80
118.97.99.186:8080
219.130.39.9:3128
2.179.143.47:8080
223.27.145.173:3128
91.75.24.162:3128
201.24.79.29:3129
118.97.164.178:8080
222.76.219.77:80
177.53.146.234:3128
200.167.185.179:3128
217.66.212.241:8080
190.255.39.146:3128
186.109.89.208:3128
41.215.26.166:80
202.79.18.28:8080
110.139.133.16:3128
200.198.116.61:3128
41.234.202.122:80
203.190.190.68:8080
175.100.114.170:8080
112.85.42.69:80
113.53.240.90:3128
201.65.237.68:3128
200.198.116.62:3128
80.191.122.21:3128
196.1.178.254:3128
193.194.85.94:8080
187.11.171.61:3128
219.83.100.197:8080
64.152.0.45:80
187.62.103.10:8080
109.204.121.123:80
186.24.8.130:3128
78.39.56.19:3128
182.72.244.210:80
110.138.215.112:8080
190.249.188.229:80
182.72.244.210:8080
27.124.88.237:8080
186.235.108.187:8080
220.189.209.244:3128
187.6.245.123:80
202.143.152.52:3128
200.107.148.138:8080
41.35.47.164:8080
119.1.174.28:3128
120.35.31.101:8080
190.210.96.17:80
12.170.91.242:3128
200.107.236.162:8080
190.226.225.16:8080
78.45.134.216:3128
118.97.94.194:80
118.98.35.251:8080
222.60.8.86:80
195.7.32.214:8080
116.66.204.131:3128
119.145.197.69:8080
186.216.160.147:8080
200.161.84.42:3128
110.139.13.121:8080
118.97.195.106:8080
123.234.31.130:8090
202.162.223.106:8080
221.133.241.126:8080
200.27.114.228:80
122.48.31.76:80
186.113.181.50:80
114.199.126.122:3128
202.44.14.73:8080
61.144.23.76:8080
41.236.175.88:8080
124.199.126.26:80
221.195.42.195:8080
201.66.67.2:3128
187.0.221.195:3128
218.240.42.238:3128
125.65.110.183:8080
60.12.117.28:80
221.2.159.175:8080
190.253.95.229:8080
118.97.195.202:8080
189.127.176.49:8080
118.96.9.139:80
110.137.128.33:8080
189.254.250.34:80
190.196.19.157:3128
201.39.157.226:3128
41.162.7.234:8081
120.197.10.142:3128
182.50.8.213:443
221.122.53.228:3128
189.17.58.194:3128
118.97.94.234:8080
200.59.34.102:8080
95.9.85.101:3128
203.113.116.115:8080
60.29.101.130:8080
190.203.101.57:8080
217.218.212.194:3128
222.124.178.98:3128
110.139.59.89:3128
190.41.192.103:8080
115.96.32.18:3128
60.217.32.143:88
77.123.88.13:3128
187.11.211.88:3128
187.32.101.68:3128
177.52.17.179:3128
81.213.157.71:80
77.105.45.59:8080
219.83.100.205:8080
200.60.11.22:8080
77.95.95.205:80
190.200.11.181:8080
186.153.181.226:8080
110.136.203.65:8000
118.96.93.121:8080
217.218.212.197:3128
118.96.153.161:3128
190.41.82.224:80
190.37.110.206:8080
190.90.100.104:8000
190.253.213.196:3128
97.81.243.127:8080
118.96.231.137:8080
59.90.213.18:3128
116.50.30.36:8080
190.249.191.189:80
218.152.121.184:8080
190.41.82.224:8080
41.35.47.124:8080
122.52.117.92:8080
41.202.0.131:3128
218.28.111.46:8080
118.96.79.252:8080
189.2.17.22:3128
81.223.49.108:8080
88.131.82.241:3128
186.201.111.34:8080
202.28.110.17:8080
118.96.251.234:8080
121.12.250.201:3128
117.239.112.50:8080
187.28.83.131:8080
189.254.250.34:8080
201.248.236.105:8080
111.93.166.202:8080
109.75.64.30:3128
200.45.236.52:80
190.105.161.208:3128
61.93.130.39:80
110.138.210.160:8080
115.178.127.130:8080
202.78.196.34:8080
201.76.212.108:3128
101.50.17.25:8080
190.40.61.40:3128
190.40.7.107:80
125.161.127.160:8080
189.50.142.94:80
109.205.114.178:8080
41.73.2.36:8080
201.219.17.45:8080
202.102.58.209:80
200.253.116.5:3128
200.253.116.2:3128
210.212.194.60:3128
193.227.178.5:80
190.249.188.229:8080
110.139.118.95:8080
119.161.238.90:80
202.164.211.74:8080
186.219.238.34:8080
201.249.94.27:3128
118.96.228.243:3128
125.165.54.9:8080
110.139.151.124:8080
180.243.86.215:8080
91.120.21.169:80
190.121.135.178:8080
110.139.166.35:8080
115.85.65.162:8080
218.28.142.100:8080
118.96.105.143:8080
186.114.134.20:8080
196.214.91.206:3128
119.18.148.41:8080
31.151.46.89:8080
122.166.226.241:8080
110.138.146.30:8080
81.223.49.104:8080
189.3.23.3:9090
203.113.118.37:3128
203.153.121.131:8080
110.138.100.145:8888
189.16.82.34:8080
121.22.34.166:3128
213.175.164.218:80
200.198.119.210:8080
190.196.19.129:3128
200.206.129.229:80
201.44.59.158:3128
122.0.66.102:8080
220.157.99.78:8080
70.167.146.73:8080
186.153.120.42:8080
62.56.137.5:8080
202.146.143.17:3128
200.174.143.194:3128
203.153.31.218:8080
200.93.182.18:8080
200.63.71.225:8080
202.51.120.58:8080
81.223.49.106:8080
201.217.59.110:8080
201.248.43.232:3128
110.139.24.116:3128
116.68.207.80:8080
61.166.144.189:8080
93.157.254.37:8080
187.1.120.65:8080
201.65.255.34:8080
202.137.18.162:3128
78.39.68.145:3128
120.88.10.168:8123
75.147.206.177:80
201.210.191.146:8080
114.130.13.222:8080
41.189.36.26:3128
61.141.21.34:8080
110.139.62.192:8080
110.139.57.63:8080
202.146.143.33:3128
80.251.247.14:3128
116.66.194.33:8080
201.251.59.1:80
219.80.4.150:3128
72.64.146.73:3128
46.109.21.107:8080
189.36.137.66:8080
201.88.254.242:80
192.162.150.77:3128
124.161.63.194:80
41.234.202.1:8080
217.219.133.30:3128
202.62.118.138:3128
202.77.107.110:8082
201.15.62.235:8080
93.99.113.91:8080
118.96.135.49:8080
200.37.200.71:80
196.32.195.42:3128
122.54.181.125:3128
186.153.121.18:8080
188.173.80.134:8080
201.130.47.33:80
187.75.129.184:3128
110.136.145.41:3128
213.244.81.245:8080
41.75.201.146:8080
116.90.209.91:8080
201.57.117.242:8080
222.60.8.66:8080
116.90.208.187:8080
41.32.36.202:80
110.138.215.48:80
27.131.130.66:8080
118.96.92.93:3128


================================================
FILE: regex.conf.sample
================================================
<!--
  pastemon.pl regex configuration file (sample)
//-->

<config>
        <regex>
                <search>rootshell\.be</search>
        </regex>
        <regex>
                <search>anonbelgium</search>
        </regex>
        <regex>
                <search>-----BEGIN (RSA|DSA) PRIVATE KEY-----</search>
        </regex>
        <regex>
                <search>-- phpMyAdmin SQL Dump</search>
        </regex>
	<!--
	Example of exclusion: search for "belgium" without "france" in the same pastie
	//-->
        <regex>
                <search>belgium</search>
		<exclude>france</exclude>
        </regex>
	<!--
	Example of inclusion: search for "belgium" and "luxembourg" in the same pastie
	//-->
        <regex>
                <search>belgium</search>
		<include>luxembourg</include>
        </regex>
	<!--
	Example of count: match 5 IP addresses
	//-->
	<regex>
		<search>\d+\.\d+\.\d+\.\d+</search>
		<count>5</count>
	</regex>
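	<!--
	Illustrative addition, not part of the original sample: match 10 e-mail addresses.
	This is a sketch that assumes <count> behaves exactly as in the IP address
	example above; the e-mail pattern is a hypothetical, deliberately loose one.
	//-->
	<regex>
		<search>[\w.+-]+@[\w-]+\.[\w.-]+</search>
		<count>10</count>
	</regex>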
</config>


================================================
FILE: user-agents.conf
================================================
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1)
Opera/9.20 (Windows NT 6.0; U; en)
Googlebot/2.1 (+http://www.googlebot.com/bot.html)
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20060127 Netscape/8.1
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.56 Safari/536.5
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
Mozilla/5.0 (iPhone; CPU iPhone OS 5_1_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:12.0) Gecko/20100101 Firefox/12.0
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; .NET CLR 1.1.4322; PeoplePal 6.2)
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0