BIND DNS query log shipping into a MySQL database

Yay! I’ve been wanting to do this for a while! Here it goes:-

Documented herein is a method for shipping BIND DNS query logs into a MySQL database and then reporting upon them!

Note: SSH keys are used for all log-ons so that the scripts never stall at a password prompt.

BIND logging configuration

The query logging directive in BIND’s named.conf should be set to simple logging:-

logging{

  # Your other log directives here

  channel query_log {
    file "/var/log/query.log";
    severity info;
    print-time yes;
    print-severity yes;
    print-category yes;
  };

  category queries {
    query_log;
  };
};

A simple log is needed because the built-in BIND log rotation, when based on time, only allows a granularity of 1 day, hence an external log rotation method is required for anything under 24 hours.

BIND query log rotation

My external BIND log rotation script is scheduled from within cron and it looks like this:-

#!/bin/bash
QLOG=/var/named/chroot/var/log/query.log
LOCK_FILE=/var/run/${0##*/}.lock

if [ -e $LOCK_FILE ]; then
  OLD_PID=`cat $LOCK_FILE`
  # Exit quietly if the previous run is still alive
  if ps -p "$OLD_PID" > /dev/null 2>&1; then
    exit 0
  fi
fi
echo $$ > $LOCK_FILE

cat $QLOG > $QLOG.`date '+%Y%m%d%H%M%S'`
if [ $? -eq 0 ]; then
  > $QLOG
fi
service named reload

rm -f $LOCK_FILE

Place this in the crontab, running every one to six hours. Ensure it does not run on the hour, or at the same time as other instances of this job on associated servers.

Make sure /var/named/chroot/var/log/old exists; the data pump script later on moves processed log files there.

From here, I create a MySQL table, called dnslogs with the following structure:-

create table dnslogs (
  q_server    VARCHAR(255),
  q_timestamp DATETIME,
  q_client    VARCHAR(15),
  q_view      VARCHAR(64),
  q_text      VARCHAR(255),
  q_class     VARCHAR(8),
  q_type      VARCHAR(8),
  q_modifier  VARCHAR(8)
);

You can either define a database user with a password and configure it as such in the scripts, or you can define a database user which can only connect and insert into the dnslogs table.
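As a rough sketch of the second option, the statements for an insert-only account could be generated like this. The account name dnslog_writer, the localhost host part and the placeholder password are assumptions for illustration, not part of the original set-up, and the function only prints SQL.

```shell
#!/bin/bash
# Sketch only: prints SQL for an insert-only account; nothing is executed.
# dnslog_writer, localhost and change_me are hypothetical placeholders.
insert_only_user_sql() {
  local db=$1 user=$2 pass=$3
  cat <<EOF
CREATE USER '$user'@'localhost' IDENTIFIED BY '$pass';
GRANT INSERT ON $db.dnslogs TO '$user'@'localhost';
EOF
}

insert_only_user_sql your_db dnslog_writer change_me
```

The output can be reviewed and then piped into the mysql client as an administrative user. An account like this cannot run the reporting query further down, so reporting would need a separate account with SELECT.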

Then I use the following shell script to pump the rotated log data into the MySQL database:-

#!/bin/bash
export PATH=/path/to/specific/mysql/bin:$PATH
DB_NAME=your_db
DB_USER=db_user
DB_PASS=i_know_it_is_a_bad_idea_storing_the_pass_here
DB_SOCK=/var/lib/mysql/mysql.sock
DEST_TABLE=dnslogs
SSH_USER=someone
LOG_DIR=/var/named/chroot/var/log
LOG_REGEX='query.log.*'
NAME_SERVERS="your name server list here"

LOCK_FILE=/var/run/${0##*/}.lock
ERROR_LOG=/tmp/${0##*/}.err

if [ -e $LOCK_FILE ]; then
  OLD_PID=`cat $LOCK_FILE`
  # Exit quietly if the previous run is still alive
  if ps -p "$OLD_PID" > /dev/null 2>&1; then
    exit 0
  fi
fi
echo $$ > $LOCK_FILE

for host in $NAME_SERVERS; do
  # Quote the pattern so that find expands it, not either shell
  REMOTE_LOGS=$(ssh -l $SSH_USER $host "find $LOG_DIR -maxdepth 1 -name '$LOG_REGEX'" | sort)
  for f in $REMOTE_LOGS; do
    # After sed: $1 date, $2 time, $3 milliseconds, $7 client, $9 view,
    # $11 query name, $12 class, $13 type, $14 flags
    ssh -C -l $SSH_USER $host "cat $f" | \
      sed 's/\./ /; s/#[0-9]*://; s/: / /g; s/\///g; s/'\''//g;' | \
      awk -v h="$host" -v t="$DEST_TABLE" -v q="'" '{
        ts = sprintf("%s %s.%06d", $1, $2, $3 * 1000)
        printf("insert into %s values (%s, STR_TO_DATE(%s, %s), %s, %s, %s, %s, %s, %s);\n",
          t, q h q, q ts q, q "%d-%b-%Y %H:%i:%S.%f" q,
          q $7 q, q $9 q, q $11 q, q $12 q, q $13 q, q $14 q)
      }' | mysql -A -S $DB_SOCK -u $DB_USER --password=$DB_PASS $DB_NAME 2> $ERROR_LOG
    RETVAL=$?
    if [ $RETVAL -ne 0 ]; then
      echo "Import of $f returned non-zero return code $RETVAL"
      test -s $ERROR_LOG && cat $ERROR_LOG
      continue
    fi
    # Only move the file away once it has been imported cleanly
    ssh -l $SSH_USER $host mv $f ${f%/*}/old/
  done
done
rm -f $LOCK_FILE $ERROR_LOG

Put this script into a file and schedule it from within crontab, running long enough after the rotate job for the rotation to complete, but before the next rotate job.

Note that the last operation of the script is to move the processed log file into $LOG_DIR/old/.

This will take each file matching /var/named/chroot/var/log/query.log.* and ship it into the dnslogs table as frequently as is defined in the crontab.
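The field numbers that the pump script's awk step relies on can be checked by hand before pointing it at a live database. The sketch below runs the same sed expression over one made-up log line; the sample line is an assumption about the log layout, and real lines may differ between BIND versions, so compare it against your own query.log first.

```shell
#!/bin/bash
# Hypothetical sample line; check it against your own query.log.
SAMPLE='28-Mar-2012 12:34:56.789 queries: info: client 192.0.2.10#53124: view internal: query: www.example.com IN A +'

# The same clean-up the pump script applies before awk splits the fields:
# split off the milliseconds, drop the client port, strip ": ", "/" and "'".
normalise() {
  sed 's/\./ /; s/#[0-9]*://; s/: / /g; s/\///g; s/'\''//g;'
}

printf '%s\n' "$SAMPLE" | normalise | \
  awk '{ printf("ms=%s client=%s view=%s name=%s class=%s type=%s flags=%s\n",
                $3, $7, $9, $11, $12, $13, $14) }'
# prints: ms=789 client=192.0.2.10 view=internal name=www.example.com class=IN type=A flags=+
```

If the fields do not line up like this on your own logs, adjust the field numbers in the awk step before importing.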

From here, it is possible to report from the db with a simple query method such as:-

#!/bin/bash
export PATH=/path/to/specific/mysql/bin:$PATH
DB_NAME=your_db
DB_USER=db_user
DB_PASS=i_know_it_is_a_bad_idea_storing_the_pass_here
DB_SOCK=/var/lib/mysql/mysql.sock
SSH_USER=someone
SQL_REGEX='%your-search-term-here%'

LOCK_FILE=/var/run/${0##*/}.lock

if [ -e $LOCK_FILE ]; then
  OLD_PID=`cat $LOCK_FILE`
  # Exit quietly if the previous run is still alive
  if ps -p "$OLD_PID" > /dev/null 2>&1; then
    exit 0
  fi
fi
echo $$ > $LOCK_FILE

echo "select * from dnslogs where q_text like '$SQL_REGEX';" | \
  mysql -A -S $DB_SOCK -u $DB_USER --password=$DB_PASS $DB_NAME

rm -f $LOCK_FILE

And there it is! SQL reporting from DNS query logs! You can turn this into whatever report you like.

From there, you may wish to script solutions to partition the database and age the data.

Database partitioning should be done on the q_timestamp value, dividing the table into periods which align with the depth of reporting that is expected. As a minimum, I would recommend keeping at least 4 days of data in partitions of between 1 hour and 24 hours, depending upon the reporting expectations. If reports are upon the previous day’s data only, then 1 partition per day will do, while reports which are only interested in the past hour or so will benefit from having partitions of an hour. In MySQL, sub-partitions are not worthwhile here because they give you nothing more than partitions do, but add a layer of complexity to what is otherwise a linear data set.
Once partitioning is established, it should be possible to fulfill reports by querying only the relevant partitions to cover the time span of interest.
Partitioning also has another benefit, which is data aging. Instead of deleting old records, it is possible to drop entire partitions which cover select periods of time without having to create a huge temporary table to hold the difference, as would be required by a delete operation. This becomes an extremely useful feature if the table is larger than the amount of free space left on the disk.
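As a rough idea of what daily partition maintenance could look like, the sketch below only prints the ALTER TABLE statements for one day. It assumes GNU date, a table already partitioned with PARTITION BY RANGE (TO_DAYS(q_timestamp)), and partitions named pYYYYMMDD; the naming scheme and the example dates are made up for illustration.

```shell
#!/bin/bash
# Sketch only: prints partition maintenance SQL; nothing is executed.
# Assumes GNU date and a table partitioned BY RANGE (TO_DAYS(q_timestamp))
# with partitions named pYYYYMMDD (a hypothetical naming scheme).
TABLE=dnslogs

add_partition_sql() {   # $1 = day the new partition holds, as YYYY-MM-DD
  local day=$1 next
  next=$(date -u -d "$day +1 day" +%Y-%m-%d)
  echo "ALTER TABLE $TABLE ADD PARTITION (PARTITION p${day//-/} VALUES LESS THAN (TO_DAYS('$next')));"
}

drop_partition_sql() {  # $1 = day to age out, as YYYY-MM-DD
  echo "ALTER TABLE $TABLE DROP PARTITION p${1//-/};"
}

add_partition_sql 2012-03-29
drop_partition_sql 2012-03-25
```

ADD PARTITION only appends to the top of a range, so new partitions need to be created ahead of the data arriving, and the table must not have a catch-all MAXVALUE partition for this form to work.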

Script updates for add and drop partition to follow….

Why I love admin.com’s MX record!

It’s pretty fair to say that admin.com is probably one of the most abused domains in the world.

I take my hat off to them in their attempt to combat spam.

They took the simple, elegant solution of setting their MX record to localhost.

This, dear reader, is pure genius.

It is genius because any DNS-aware mail server carrying mail for admin.com will burn itself out on repeated local delivery attempts, as this MX record to localhost forces the mail server into attempting delivery to itself.

The added bonus of this method is that mail never hits admin.com’s servers, thus ensuring that their servers never have to handle the dross of spam.
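To see whether a domain is using this trick, the MX answer can be inspected with a small filter. This is only a sketch: the function name mx_is_localhost is made up, and it expects "preference target" lines on standard input, which is the format printed by dig +short MX.

```shell
#!/bin/bash
# Reads "preference target" lines (as printed by `dig +short MX`) and
# succeeds only if there is at least one MX and every target is localhost.
mx_is_localhost() {
  awk '
    { seen = 1; t = tolower($2); sub(/\.$/, "", t); if (t != "localhost") bad = 1 }
    END { exit ((seen && !bad) ? 0 : 1) }
  '
}

# Offline demonstration with a hypothetical answer line
if printf '0 localhost.\n' | mx_is_localhost; then
  echo "MX points at localhost"
fi
```

Against a live domain this would be run as: dig +short MX admin.com | mx_is_localhost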

Obviously this method is no use if you actually want to receive mail, so it is only suitable in this uncommon situation and, hmm, maybe some others.

It may be a suitable way of making the decommissioning of a domain noticeable, such that the receiving SMTP servers catch no load and the sending SMTP servers get to see all the errors.

This may also be a useful spoofing technique for DNS views within your control if you want to suppress mail to certain domains within a subscribed client-base.

Or maybe for suppressing mail from a machine on which it is not possible to stop applications from mailing out.

A quick ‘hack’ to test this on any given machine is to alias the given domain to localhost in the /etc/hosts or c:\windows\system32\drivers\etc\hosts file in order to elicit the same outcome.

Caution is recommended – don’t lock out access to key hosts like yourself or the device’s default router by aliasing critical network nodes. Your mileage may vary – don’t alias the name of the machine which you are using to administer the given device (if that name is known to the device).

Search Engines and Privacy

Have you ever wondered how search engines actually make their money from advertising?

Have you ever had privacy concerns over the search terms you use?

Have you ever been freaked out by how well targeted modern advertising is?

If you can say yes to any of these – then read on! Otherwise, still read on for an eye-opener.

Search engines only exist because there is a financial model behind them which is there, naturally, to generate profits. So, how do search engines make their profit? Search engines primarily make their money from advertising (as their search and associated services are free to end-users) and they achieve this by three primary techniques:-

  1. Provide space for adverts on the site and rent them out.
  2. Extend scope of adverts through syndication schemes, embedded content and ‘like’ buttons.
  3. Sell your data (search terms and the IP addresses that they come from along with browser unique IDs held in cookies etc) to 3rd parties.

Through the many different tracking technologies available, it is easy to identify and build a profile of an Internet user. This data is collected in the form of search engine logs (on their servers) which can then be analysed either in real time or at a later date. This statistical analysis provides deep insight into what other products and services might be of interest to you, which enables targeted marketing; however, there can be a far more sinister use for this data too.

This data can be used to profile a person in order to find out things such as:-

  • Your name and any aliases (such as ‘internet’ names and previous names)
  • Birthday
  • Location
  • Address
  • Telephone Numbers
  • Email Addresses
  • Your interests
  • Tastes in music, clothes, and also food and drink
  • Your faith and beliefs
  • Your friends and associations
  • Your spending habits
  • Where you like to go
  • Times when you are not at home
  • Your car make, model, and registration
  • Where you work and what you do for a living
  • Your thoughts and feelings
  • Pictures of you in places and with people

All of these are used for profiling you in order to place you within a demographic classification which can then be identified and targeted for a number of uses including advertising.

How are these stats collected? The user normally gives them wilfully, without a second thought, to sources such as your favourite search engine, along with Facebook, Twitter, Flickr, and other social media. The data is then tied together into continuous sessions using tracking cookies.

Many social media tools like facial recognition on Facebook enable people to be accurately associated with others, another social engineering danger, which usually starts with “do you know ‘so-and-so’?”

Many web pages register your presence with syndication partners merely by viewing the page, for example, Google and Facebook get informed of your visit every time you visit a page with some of their ‘like’ buttons (but not all), so in many cases – even if you don’t click it, you may get tracked. Same for YouTube videos on sites other than YouTube itself – YouTube (and thus Google) will know you’ve visited even if you don’t watch any movies because it will have linked back to YouTube in order to provide the ‘player’ for that shared content. In another example, Amazon’s affiliate advertising syndication could be used to track users across pages which show Amazon affiliate adverts.

An interesting concept in social tracking is that if you use your friend’s wireless with your own equipment, your tracking session can be seen moving to an address which is the source of other known tracking sessions. It is thus possible to track your physical movements and to correlate you with others through simultaneously sharing the same IP. Your iPhone or Android will go with you everywhere, and wherever there is wireless configured (with the correct password), it will use it and tell on you. Being fair on the matter, the telephone companies can do this far more easily through 3G phone networks, but being in the same proximity does not always positively prove a relationship – unlike sharing an IP.

It is also possible for ISPs to proxy your web traffic for the purpose of caching content in order to deliver a faster network – this can also be used as a source of data.

Many people simply ‘like’ products, creating for themselves an association which allows others to gauge their persona, because when you ‘like’ a product on Facebook, it tells all of your friends (or at least ‘friends’ on Facebook). This could then be used in social engineering attacks against you. Google track you simply for viewing a page with a Google+ button on it.

Google is so elusive – you need their “opt-out” plugin to avoid them – which as discussed before comes with an auto-update program which still ‘phones-home’.

So, by using social, open-source, and purchased data from many different sources such as Facebook, Twitter, MySpace, Google, Bing, etc., it is easily possible to build an accurate profile of a person, their relationships, and their surfing habits, which will reveal insight into that person in order to target them for one purpose or another. This indeed is what the advertisers are after, which is why this data has value.

While this data has value, it is also private data about individuals, including you. A few traces will reveal little about you, but long-term traces can reveal lots more than you realise. How often do you clear your cookie cache? And do you have ‘Do Not Track’ set on your web browser?

A recent row has broken out between online advertisers and Microsoft, who have taken the bold step of enabling ‘Do Not Track’ by default on their latest web browsers. If you’re not aware, ‘Do Not Track’ is a setting which makes the browser ask each site not to track you. The advertisers are up in arms – and this tells you a lot about the value of the data.

To protect yourself from this form of personal data leakage, you should choose your web browser and search engines wisely. I am currently using SRWare Iron for a web browser because it has all the power and prowess of Google Chrome with all of the Google-phone-home stuff taken out and a few safety features made default. I use duckduckgo.com for a search engine because it supports encrypted connections through https, it does not record your IP in its logs, and they discard your results after 2 days. Duckduckgo.com also has a search portal within Tor, and they are strong advocates of internet privacy.

See duckduckgo’s privacy statement here:-

https://www.duckduckgo.com/privacy.html

Another search engine worth considering is ixquick, whose privacy policy can be seen here:-

https://ixquick.com/eng/privacy-policy.html

For comparison, here’s Google’s privacy statement:-

https://www.google.co.uk/intl/en/policies/privacy/

Wow! Need I say more?

It is worth noting that all searches done using standard http can be recorded ‘on-the-wire’ by anyone who is monitoring the traffic. This includes all searches, returned content, and modifications to those searches, often character-by-character where auto-fill offers search suggestions. For this reason, it is always worth using a search engine which can support https, as this will stop a degree of (but not all) snooping on the wire.

Another point worth noting is that you often lose protection the second you leave the https encrypted search engine page, because you then give away the site which the search engine led you to. Many search engines track this information too, adding to their knowledge base not only what you searched for, but which links you clicked on. An encrypted search therefore does not protect you beyond the initial search you undertake.

So, now knowing the extent of browser tracking, I encourage you to consider this when surfing the Internet, and take measures to protect your privacy.

Cookie Monster

So often I hear about people being concerned with browser cookies – what they are, what they do, and what they tell about you. Well, there’s lots on that already. In this post I recommend that users set their web browsers to clear down all cookies after every session in order to ensure information from one browsing session is not passed to the next. I also recommend the blocking of all 3rd party cookies, and also recommend setting your browser history to “0”. This way, all residual information should be cleared down at the end of each browsing session.

Google analytics is evil (imho)

It is my opinion that Google is evil, or at least their analytics is evil – they are one of the few search engines where you have to install a browser plugin to opt out of their analytics, and to make things worse, it comes with an automatic update system which pollutes your Windows services and scheduled tasks in order to make it phone home to Google anyway! It somehow reminds me of Phorm!

Browser Search Bar Plugins

Browser search bar plugins are like a plague on your privacy as they send stats on you back to their respective companies – only ever use the simple search set inside the top right or address bar of your browser and set it to something secure like https://duckduckgo.com or https://encrypted.google.com. All search bars which offer any kind of automatic search or auto-completion or ‘safe-search’ filtering will provide all of your web surfing traffic to a 3rd party who may then sell or expose that data. For this reason – all browser search plugins unless absolutely necessary (like when provided as part of a malware scanner) should be avoided like the plague.

Running as a non-Administrator user

How did your computer last become infected with a virus? Did it install without asking you first? Or did it change a load of settings you couldn’t undo? If yes to any of these, then it was probably because you were logged in with administrator privilege.

To mitigate against unwanted changes to your PC – always use a non-Administrator account. Set up an alternative account with administrator rights and assign it a password, such that you always get asked for a password before critical changes are made to your machine – thus allowing you to ask yourself “did I really ask it to change something critical or permanent?” If not, then cancel out of it and prevent it from happening.

Privilege separation is a key aspect of computer security. Even the most secure of systems can be made insecure by badly coded programs allowing unprivileged users to modify critical files and settings.

It is too easy to “just-click-yes” and then go oops afterwards. A separate administrator account provides a nice barrier against unwanted and unauthorised changes.

SRWare Iron

Who does your browser consult when deciding which sites to go to? Well, many browsers come with ‘smart’ technologies which ‘phone-home’ to various vendors and organisations of sorts, providing a potential data leakage of web request URLs through services such as ‘safe-search’ and browser search plugins.

SRWare Iron is Google Chrome with all the Google-phone-home stuff taken out, and while yes, I know you can tweak Chrome to be as quiet if you want – Iron comes secured out of the box, with all functionality where data privacy is a concern turned off permanently. Get it at:-

http://www.srware.net/en/software_srware_iron_download.php

Recent 0day IE vulnerability causes Microsoft to recommend EMET

A recent 0day on IE caused Microsoft to recommend a lesser-known but long-standing Microsoft tool called the Microsoft Enhanced Mitigation Experience Toolkit (EMET), which recently hit v3.0 and along with it gained official support from Microsoft for use in a production environment.

This is a monumental security milestone for Microsoft, as it addresses the reason why certain classes of malicious code can run at all, fixing the flaw which lets the attack happen rather than catching the attack in hand.

There is a profile included in EMET which you can import, and this contains most of the popular applications. If you review those apps, certain mitigations are turned off on certain apps, which shows evidence of some testing (which you shouldn’t then need to do yourself).

What EMET provides is a strong mitigation for a whole class of vulnerabilities which target popular software such as web browsers, browser plugins, Adobe Acrobat, Shockwave Flash, and any other application exposed to data from untrusted sources like the internet. The EMET method of mitigation is so successful that it is better than antivirus for blocking these types of attacks, as it provides protection from future unknown threats of this kind and it never needs ‘updating’ with virus signatures.

I have successfully been running EMET for 5 or so months now in the dangerous ‘opt-out’ for everything configuration without issue. The only mitigation I had an issue with was ASLR for media players or realtime apps.

While some programs are genuinely badly designed and won’t work with many types of mitigations, the few which actually get killed really need to be questioned – do you want to run code which is so bad it triggers? What I find quite surprising is how many times EMET may close a plugin while I’m browsing!

This toolkit is a must for everyone with a Windows machine, simple. Download EMET now from Microsoft, located here:-

http://support.microsoft.com/kb/2458544