Get public IP Address
Mon, 08/27/2012 - 21:01 - sandip
Get the current public IP address from the command line with curl or wget.
With curl:
curl icanhazip.com
curl ifconfig.me
With wget:
wget -qO- icanhazip.com
wget -qO- ifconfig.me/ip
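The two services can also be combined into a lookup with a fallback. The following is only a minimal sketch, assuming a POSIX shell and curl; the function names is_ipv4 and public_ip are made up for illustration, and restricting the answer to IPv4 (via curl -4) is an assumption, not something the services require.

```shell
# Succeed if the argument looks like a dotted-quad IPv4 address.
# The body runs in a subshell, so the IFS change stays local.
is_ipv4() (
    case $1 in
        ''|*[!0-9.]*|.*|*.|*..*) exit 1 ;;
    esac
    IFS=.
    set -- $1                      # split the address on dots
    [ $# -eq 4 ] || exit 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || exit 1
    done
)

# Ask each service in turn and print the first answer that validates.
public_ip() {
    for url in icanhazip.com ifconfig.me/ip; do
        ip=$(curl -4 -s --max-time 5 "$url") || continue
        if is_ipv4 "$ip"; then
            printf '%s\n' "$ip"
            return 0
        fi
    done
    return 1
}
```

Calling public_ip prints the address, or returns non-zero if neither service answers with something that passes the IPv4 check.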
Check for fake googlebot scrapers
Wed, 10/05/2011 - 12:13 - sandip
I noticed a bot scraping the site with a fake Googlebot user-agent string.
Here is a one-liner that lists the IPs to ban:
awk 'tolower($0) ~ /googlebot/ {print $1}' /var/www/httpd/access_log | grep -v '^66\.249\.71\.' | sort | uniq -c | sort -n
It runs a case-insensitive awk match for the keyword "googlebot" against the Apache access log, drops IPs beginning with "66.249.71." (which belong to Google), and prints the remaining IPs sorted by hit count.
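To try the pipeline without touching a live log, it can be wrapped in a function that takes the log path as an argument. This is a rough sketch rather than a drop-in tool: the name suspect_googlebot_ips is hypothetical, the grep pattern is anchored and escaped so it only matches the start of the address, and the trailing awk is an addition that normalizes the padding uniq -c puts in front of the count.

```shell
# Print "count ip" for every client that claims to be Googlebot
# from outside the 66.249.71.x range, least active first.
suspect_googlebot_ips() {
    awk 'tolower($0) ~ /googlebot/ {print $1}' "$1" |
        grep -v '^66\.249\.71\.' |
        sort | uniq -c | sort -n |
        awk '{print $1, $2}'
}
```

Usage would be along the lines of: suspect_googlebot_ips /var/www/httpd/access_log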
You can validate the IPs with:
IP=66.249.71.37 ; reverse=$(dig -x "$IP" +short | grep -E '\.googlebot\.com\.$') ; ip=$(dig $reverse +short) ; [ "$IP" = "$ip" ] && echo "$IP GOOD" || echo "$IP FAKE"
Replace the IP value with the one you want to check.
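The same check (reverse lookup, domain match, then forward lookup back to the original IP) can be written as a reusable function. This is a sketch under assumptions: the names is_googlebot_host and verify_googlebot are invented here, dig is assumed to be installed, and only the googlebot.com domain from the one-liner is accepted.

```shell
# Succeed only for host names under googlebot.com (trailing dot optional).
is_googlebot_host() {
    case $1 in
        *.googlebot.com|*.googlebot.com.) return 0 ;;
        *) return 1 ;;
    esac
}

# Reverse-resolve the IP, keep googlebot.com names, then confirm that
# one of them resolves forward to the same IP.
verify_googlebot() {
    ip=$1
    for host in $(dig -x "$ip" +short); do
        is_googlebot_host "$host" || continue
        for addr in $(dig "$host" +short); do
            if [ "$addr" = "$ip" ]; then
                echo "$ip GOOD"
                return 0
            fi
        done
    done
    echo "$ip FAKE"
    return 1
}
```

Matching on the ".googlebot.com" suffix rather than the bare string keeps a name such as googlebot.com.example.net from passing the check.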
Find files used for htauth
Wed, 04/06/2011 - 15:39 - sandip
The following lists all of the files used for Apache authentication under the /var/www/html path:
find /var/www/html -name .htaccess | xargs awk '{sub(/^[ \t]+/,"")};/File/{print $2}' | sort | uniq
Here is the breakdown:
find /var/www/html -name .htaccess
Find all files named ".htaccess" under "/var/www/html".
xargs awk '{sub(/^[ \t]+/,"")};/File/{print $2}'
xargs hands the files found to awk, which strips leading whitespace (spaces and tabs) from each line and prints only the second field of lines containing the text "File".
sort | uniq
The awk output is then piped through sort and uniq, leaving a unique list of the files used for Apache authentication.
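A slightly stricter variant of the breakdown above is sketched here. It is not the original method: the function name htauth_files is hypothetical, find -exec replaces xargs so that paths containing spaces survive, and matching only the AuthUserFile and AuthGroupFile directives (case-insensitively) is an assumption about which "File" lines matter.

```shell
# htauth_files DIR
# Print the unique password and group files referenced by .htaccess
# files below DIR.
htauth_files() {
    find "$1" -name .htaccess -exec awk '
        # Directive names are case-insensitive; $1 ignores leading whitespace.
        tolower($1) ~ /^auth(user|group)file$/ { print $2 }
    ' {} + | sort -u
}
```

For example, htauth_files /var/www/html prints one path per line. Paths that Apache allows to be quoted and contain spaces would be cut at the first space, so this sketch assumes unquoted paths.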
Remote backups with tar over ssh
Mon, 02/28/2011 - 15:26 - sandip
Below is an example of backing up a user's home directory to a remote host, piped through ssh:
tar -cvzf - -C /home {username} | ssh {remotehost} 'cat >/path/to/bak/{username}.tgz'
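The same pipe can be wrapped in a pair of functions, one to send the archive and one to check that it can be read back. This is a sketch only: backup_home, verify_backup and the HOME_BASE variable are names invented here, the remote path is assumed to contain no single quotes, and the remote host is assumed to have tar available.

```shell
# Base directory holding the home directories (assumption: /home unless overridden).
HOME_BASE=${HOME_BASE:-/home}

# backup_home USER HOST DEST_DIR
# Stream a gzipped tar of USER's home to DEST_DIR/USER.tgz on HOST.
backup_home() {
    tar -czf - -C "$HOME_BASE" "$1" |
        ssh "$2" "cat > '$3/$1.tgz'"
}

# verify_backup USER HOST DEST_DIR
# List the remote archive; a non-zero status means it is missing or unreadable.
verify_backup() {
    ssh "$2" "tar -tzf '$3/$1.tgz'" >/dev/null
}
```

The exit status of backup_home is that of ssh, the last command in the pipe, so a tar failure on the local side goes unnoticed; running verify_backup afterwards is one inexpensive way to catch an empty or truncated archive.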
Check webpage load time via wget
Tue, 02/22/2011 - 17:19 - sandip
Here is a simple one-liner to check the download time of a web page:
(time wget -p --no-cache --delete-after www.linuxweblog.com -q ) 2>&1 | awk '/real/ {print $2}'
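The one-liner prints the time in the shell's own format, for example 0m1.234s. To compare or average several runs it helps to have plain seconds, which could be done roughly as follows. The function names to_seconds and page_load_time are hypothetical, and page_load_time assumes bash, whose time keyword produces the "real 0m1.234s" line the awk filter looks for.

```shell
# Convert a bash "time" value such as 0m1.234s into plain seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# page_load_time URL
# Fetch the page with its requisites, discard the files, and print
# the elapsed wall-clock time in seconds.
page_load_time() {
    to_seconds "$( { time wget -q -p --no-cache --delete-after "$1"; } 2>&1 |
        awk '/real/ {print $2}' )"
}
```

For example, page_load_time www.linuxweblog.com prints a single number of seconds, which is easy to log or feed into further arithmetic.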