July 2016: Supratim Sanyal's Computing Blog | Wandering Digital Wastelands as a Geek

Monday, July 25, 2016

httpd410server DNS SINKHOLE: A TINY FREE WEB SERVER TO ALWAYS RETURN HTTP ERROR LOCALLY FOR DNS BLACKLIST REDIRECTION AND LOGGING

Ad-Blocker Operation Using dnsmasq and custom httpd410server

On any Linux system it is very simple to create a DNS redirection-based blacklist filter for ad blocking or malicious website blocking. I use dnsmasq with yoyo ad server blacklists for an effective ad-free home internet environment. Other solutions I have played with include dansguardian based IP filtering and the AdTrap hardware solution itself. Currently my dnsmasq based solution (described here) works well and I am pretty happy with it.

Yoyo's list of ad servers formatted for dnsmasq is basically a long list of domain names, each followed by the IP address 127.0.0.1 so that browser requests for ads go to 127.0.0.1. My script that downloads the ad server lists changes the DNS resolution addresses from 127.0.0.1 to a local server, 10.42.2.1, which runs a small HTTP and HTTPS server daemon I wrote that always returns HTTP 410 (Gone). HTTP 410 tells the client that the resource is permanently gone, with the intent that the client never asks for the blocked resource again, though how well web browsers and mobile apps honor this is doubtful.

I wrote this little web server daemon called httpd410server because I wanted to see ad-servers being blocked actively in real-time and the web sites or mobile apps requesting them. This little web server is coded to always return HTTP 410 (GONE) whenever any HTTP or HTTPS request comes in, and also logs to syslog the HTTP GET request it responds to. Therefore, I have a way to keep track of all ad server blocking activity in the house.
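As a rough illustration of the idea (this is not the actual httpd410server source), the two core pieces can be sketched in C as a fixed 410 reply plus a helper that pulls out the request line for logging. The header set, the body text and the names RESPONSE_410 and request_line() are assumptions made for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Fixed HTTP 410 reply. The headers and body are assumptions for
   illustration; the real daemon's wording may differ. */
const char RESPONSE_410[] =
    "HTTP/1.1 410 Gone\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 5\r\n"
    "Connection: close\r\n"
    "\r\n"
    "Gone\n";

/* Copy the first line of a raw HTTP request (for example
   "GET /ad.js HTTP/1.1") into out, without the trailing CR/LF.
   The copy is truncated to fit out_size. Returns the copied length. */
size_t request_line(const char *request, char *out, size_t out_size)
{
    size_t len = strcspn(request, "\r\n");

    if (out_size == 0)
        return 0;
    if (len >= out_size)
        len = out_size - 1;
    memcpy(out, request, len);
    out[len] = '\0';
    return len;
}
```

A service thread could then pass the extracted line to syslog(LOG_INFO, "%s", line) and send RESPONSE_410 (sizeof RESPONSE_410 - 1 bytes) back on the socket before closing it.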

A free download link to the stuff I describe here is at the bottom of this post.

This daemon is in the class of DNS sinkholes and is not really required for ad blocking; ad blocking works fine with redirection to an IP address that has nothing listening on its HTTP port, which results in "connection refused" from the operating system. I just did not like redirection to a non-existent service port - it is cool to have something to at least listen on the port, log the ad requests and return a meaningful (intentionally meaningless!) response.

As depicted in the diagram at the top, here is what happens using this setup:
- Someone in the house launches a web-browser and types in a URL, or launches a mobile app
- The web browser or the mobile app loads the main web page and requests DNS resolution of the domain names of the ad servers referenced on that page
- dnsmasq, as the DNS server for the LAN, receives all DNS requests, including the ones for ad servers
- dnsmasq checks the list of ad servers for the domain name being requested. If it finds a matching entry in the ad-server blocklist, it resolves the name to a local IP address (10.42.2.1). Otherwise, the normal Internet IP address for the web resource is returned.
- The user's browser or app then connects to the IP address returned by dnsmasq to load the contents from that server. For ad server requests, the web browser or mobile app connects to 10.42.2.1 as that was the IP resolved to by dnsmasq.
- 10.42.2.1 gets a connection from the web browser or mobile app, logs it to syslog (/var/log/messages) and returns an HTTP 410 (GONE) error response to the requesting web browser or mobile app.
- The net effect is the advertisement is not displayed, and we have a log of the web site requesting ads, for purely academic observation and possible analysis. (Answers to questions like this become easy: What advertising networks do the top news services use? How many requests for ads is a web page making?)
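The lookup step in the flow above can be sketched in C as follows. This is only a toy model of the decision, not dnsmasq's code: it relies on the documented behavior that an "address=/domain/ip" entry covers the domain itself and all of its subdomains. The function names and the example.com-style hosts used with it are hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Returns 1 if host equals domain or is a subdomain of it
   (case-insensitive), 0 otherwise. */
int domain_matches(const char *host, const char *domain)
{
    size_t hl = strlen(host);
    size_t dl = strlen(domain);

    if (dl == 0 || hl < dl)
        return 0;
    if (strcasecmp(host + (hl - dl), domain) != 0)
        return 0;
    /* Exact match, or the match starts right after a label separator. */
    return hl == dl || host[hl - dl - 1] == '.';
}

/* Returns sink_ip if host is covered by any blocklist entry, or NULL
   to mean "not blocked, forward the query upstream as usual". */
const char *resolve_blocked(const char *host,
                            const char *const *blocklist, size_t count,
                            const char *sink_ip)
{
    for (size_t i = 0; i < count; i++) {
        if (domain_matches(host, blocklist[i]))
            return sink_ip;
    }
    return NULL;
}
```

With a blocklist containing "ads.example.com" and a sink of "10.42.2.1", both "ads.example.com" and "img.ads.example.com" resolve to the sink, while "notads.example.com" does not, because the suffix match has to start at a label boundary.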

The C++ source for the little web server is given below. It compiles straight away with gcc on Linux. It is a minimal multi-threaded web server that always returns only one thing: HTTP 410. The -lpthread command line parameter is required at compilation to link with the Linux POSIX threads library.

The source code can also serve as a demonstration or tutorial for beginner programmers delving into the world of Linux C or C++ multi-threaded socket server programming serving multiple ports. Some of the features of my small program that may be of particular interest to novice network programmers are:

- socket(), bind(), listen() and then select() and accept() in a loop to listen to more than one port and serve clients. The server listens on both HTTP (80) and HTTPS (443) ports.
- multi-threaded TCP socket server using Linux POSIX threads: pthread_create() etc.
- use of PTHREAD_CREATE_DETACHED in pthread_attr_setdetachstate() so that the service threads are marked as detached. When a detached thread terminates, its resources are automatically released back to the system, without another thread having to join with the terminated thread as the default (joinable) behavior requires.
- critical section synchronization using the mutex mechanism: pthread_mutex_init(), pthread_mutex_lock() and pthread_mutex_unlock(). These are used to keep track of the number of simultaneous clients being served, to keep DoS (denial of service) attacks in check using an upper limit on the number of clients supported at the same time.
- using the syslog facility (provided by the syslogd daemon) for logging: openlog(), setlogmask(), syslog() etc.
- dropping of root privileges after binding the sockets, using setgid() and setuid(). This server listens on protected ports below 1024, and is thus required to start up and run through the bind phase with root privileges. After binding, it sets its gid and uid to a non-privileged account - a standard security practice for all servers.
- usage of the SO_REUSEADDR socket option in setsockopt() to avoid TIME_WAIT induced "address already in use" errors when calling bind() on rapid server daemon restart
- usage of the SO_RCVTIMEO socket option for receiving data on a socket with a timeout - the program uses this to retry recv() before giving up
- usage of the venerable unix select() call, including the helpful macros FD_ZERO, FD_SET, FD_ISSET etc.
- correct way of shutting down sockets and associated file descriptors - shutdown() with SHUT_RDWR before close()
- general error handling - check every return status and do something intelligent, even if it means exiting the server daemon process in situations where no obvious way out can be thought of
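A few of the calls listed above can be seen together in the following minimal C sketch. It is not the httpd410server source; it only illustrates SO_REUSEADDR before bind(), waiting on several listening sockets with select(), and shutdown() with SHUT_RDWR before close(). The helper names make_listener(), wait_ready() and close_connection() are invented for this example, and threading, syslog and privilege dropping are left out to keep it short.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Create a TCP socket listening on port (all interfaces).
   Port 0 lets the kernel pick a free port. Returns the fd or -1. */
int make_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* SO_REUSEADDR must be set before bind() to allow quick restarts. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait up to timeout_ms for any of the listening sockets to have a
   pending connection. Returns the first ready fd, or -1 on timeout
   or error. The caller would then accept() on the returned fd. */
int wait_ready(const int *fds, int count, int timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int maxfd = -1;

    FD_ZERO(&readable);
    for (int i = 0; i < count; i++) {
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (select(maxfd + 1, &readable, NULL, NULL, &tv) <= 0)
        return -1;
    for (int i = 0; i < count; i++) {
        if (FD_ISSET(fds[i], &readable))
            return fds[i];
    }
    return -1;
}

/* Shut down both directions of a connected socket, then release it. */
void close_connection(int fd)
{
    shutdown(fd, SHUT_RDWR);
    close(fd);
}
```

A daemon like the one described would call make_listener(80, ...) and make_listener(443, ...) while still running as root, then call setgid() and setuid() to switch to an unprivileged account before entering the wait_ready()/accept() loop. Passing port 0 is a convenient way to try the helpers without root.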

Here is a screen full of examples of httpd410server working:

Finally, I needed a standard init script to make a service out of the daemon, and that completed the setup. Here is the classic init.d script to start and stop httpd410server. This should be copied into /etc/init.d, after which it can be added to chkconfig using "chkconfig --add httpd410server".

You can download a tarball containing the daemon, source and init script from my google drive.

Sunday, July 24, 2016

ADD A FREE AD MALWARE RANSOMWARE BLOCKER WITH DNSMASQ TO CLEAROS COMMUNITY EDITION / CENTOS LINUX INTERNET SECURITY SERVER

Here is a quick and easy way to block internet ads in your home. This also blocks various tracking services that collect information about your browsing (claiming to be to fine-tune ads for you) and makes internet browsing cleaner and faster. Everything I talk about in this post can be downloaded from a link at the bottom.

I use a ClearOS Community Edition server as the innermost of my three-layer onion internet security gateway (as in layers of an onion, nothing to do with the TOR project in this post, although I will describe my TOR gateway in another post later). Among the many excellent features of ClearOS, the fact that it is built on CentOS makes it instantly familiar to those of us at ease with CentOS who want to customize everything.

Writing the Ad Server Blacklist script for Ad Blocker using dnsmasq on ClearOS running on CentOS

ClearOS comes with dnsmasq to serve DHCP and DNS requests. It provides IP addresses to DHCP hosts and forwards DNS requests to a higher-level peer DNS server. In my setup, I have three local area networks served by ClearOS: a dedicated 10.100.0.0 LAN for my hobby projects in the basement, a 10.200.0.0 "rest of the house" LAN and a host-only 10.42.0.0 LAN to provide internet to the DELL PowerEdge blade server itself that hosts all the virtual machines that make up the internet security onion gateway, and other VMs running on it.

As a side note, I named my internet safety gateway system "DORMARTH" after the great hound (also called "Dormarch") from Welsh mythology whose assistance to warriors chasing down the enemy and facilitating the passage of the dead to the other side is legendary. Most of my server hostnames have "dormarth" somewhere as a result.

Coming back to dnsmasq, before adding the ad-blocking feature, I checked on what was installed with CentOS and found the following primary configuration file.

The line "conf-dir=/etc/dnsmasq.d" was immediately interesting, telling me there is a whole directory that I can drop configuration files in for dnsmasq to pick up, without having to change anything in the main configuration file /etc/dnsmasq.conf at all.

Looking at the contents of /etc/dnsmasq.d, I found a single file /etc/dnsmasq.d/dhcp.conf which tells dnsmasq what interface to serve DHCP requests on, what IP address ranges to use for DHCP clients, how long DHCP sessions are valid before requiring renewal, etc.

The only other file I checked on was the resolver configuration file /etc/resolv-peerdns.conf referenced by /etc/dnsmasq.conf. This has one line, basically defining the peer DNS host to forward requests to. In my case, the ClearOS server forwards DNS requests to the 2nd of my three-layer onion security gateway (a Sophos UTM server):

[root@anubis-clearos ~]# cat /etc/resolv-peerdns.conf

nameserver 10.42.1.1

Now, we have enough information to add ad blocking to the dnsmasq configuration so that it stops known advertising websites from delivering ads to its DHCP and static clients.

I decided to use the great lists of advertising servers and URLs maintained by Yoyo Internet Services. The types of lists we can download from Yoyo can be chosen using selection drop-down lists and checkboxes at their "Ad blocking with ad server hostname and IP addresses" web page.

Among the numerous types of ad server list file formats, the two that I am interested in are the lists designed for dnsmasq using "address=" lines, and dnsmasq using "server=" lines. Selecting the corresponding items from the drop-down lists, choosing "no links back to this page" and checking the "view list as plain text" gives us the following two lists that we can simply dump into directory /etc/dnsmasq.d via a cron job and restart dnsmasq.

I wrote a basic shell script to download these ad server lists and drop them into /etc/dnsmasq.d, and saved it into /root/adblocker/adblocker-dnsmasq.sh. Here is adblocker-dnsmasq.sh

The original ad server list file adblocklist.conf downloaded by wget contains the IP address 127.0.0.1 that all the blocked domains resolve to. However, I run a little custom HTTP server that responds with an HTTP code of 410 (GONE) to all requests while logging the request. The purpose of this minimal server (I have described it in detail with the source code here) is to log all advertising server requests that dnsmasq blocks based on the block-list. To do that, the 127.0.0.1 addresses in the block-list need to be changed to 10.42.2.1, the IP of the server running the small HTTP 410 responder. This is accomplished by the line "sed -i 's/127.0.0.1/10.42.2.1/g' /etc/dnsmasq.d/adblocklist.conf" in the above script. The little server has its own write-up, but there is an example of what it produces later in this post.
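For readers who prefer to see the substitution spelled out, here is a hedged C sketch of what the sed line achieves for a single blocklist entry. It is an illustration only, not part of adblocker-dnsmasq.sh, and it is stricter than the sed expression: sed replaces every match anywhere in the file (and its unescaped dots match any character, which is harmless for this list), whereas this helper only rewrites "address=/<domain>/<old_ip>" lines. The function name and the ads.example.com entry used with it are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Rewrite "address=/<domain>/<old_ip>" so that it ends in new_ip.
   Any other line is copied unchanged. A trailing newline is dropped.
   Returns 1 if rewritten, 0 if copied as-is, -1 if out is too small. */
int retarget_address_line(const char *line, const char *old_ip,
                          const char *new_ip, char *out, size_t out_size)
{
    size_t len = strcspn(line, "\r\n");   /* length without newline */
    size_t old_len = strlen(old_ip);
    int n;

    if (strncmp(line, "address=/", 9) == 0 &&
        len > 9 + old_len &&
        line[len - old_len - 1] == '/' &&
        strncmp(line + len - old_len, old_ip, old_len) == 0) {
        /* Keep everything up to and including the last '/'. */
        n = snprintf(out, out_size, "%.*s%s",
                     (int)(len - old_len), line, new_ip);
        return (n < 0 || (size_t)n >= out_size) ? -1 : 1;
    }
    n = snprintf(out, out_size, "%.*s", (int)len, line);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Given the line "address=/ads.example.com/127.0.0.1", the helper produces "address=/ads.example.com/10.42.2.1"; a "server=/.../" line from the second list passes through untouched, which mirrors the fact that the script only runs sed on adblocklist.conf.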

Another subtlety of the above script is that wget preserves the timestamp of the original file it gets from the remote site. The folks at yoyo last updated the lists on July 21, and today is July 25. When I ran the script today, both adblocklist.conf and adblockserverlist.conf were fetched successfully by wget with July 21 dates on the files, but the sed command then ran on adblocklist.conf, changing its file time to the current time. adblockserverlist.conf is not manipulated at all, so its original file date is preserved.

The final piece is adding the adblocker-dnsmasq.sh script to cron so that the lists get updated automagically. I chose a weekly update schedule because advertising servers do not pop up or disappear very frequently. For the cron job, I created the file update-adblocker-dnsmasq in /etc/cron.d/ containing a directive to crond to run the ad server list updater on a schedule:

To test, I ran the updater command line directly and verified that the ad server block lists were indeed making it to the correct directory /etc/dnsmasq.d:

[root@anubis-clearos adblocker]# ls -l /etc/dnsmasq.d
total 148
-rw-r--r-- 1 root root 84187 Jul 25 03:26 adblocklist.conf
-rw-r--r-- 1 root root 60357 Jul 21 09:38 adblockserverlist.conf
-rw-r--r-- 1 root root 523 Aug 2 2015 dhcp.conf
[root@anubis-clearos adblocker]#
[root@anubis-clearos adblocker]# date
Mon Jul 25 03:30:31 UTC 2016
[root@anubis-clearos adblocker]# ls -l /etc/dnsmasq.d
total 148
-rw-r--r-- 1 root root 84187 Jul 25 03:26 adblocklist.conf
-rw-r--r-- 1 root root 60357 Jul 21 09:38 adblockserverlist.conf
-rw-r--r-- 1 root root 523 Aug 2 2015 dhcp.conf
[root@anubis-clearos adblocker]# ls -l /var/log/adblocker-dnsmasq.log
-rw-r--r-- 1 root root 144817 Jul 25 03:26 /var/log/adblocker-dnsmasq.log
[root@anubis-clearos adblocker]# ls -l /tmp/adblock*
-rw-r--r-- 1 root root 84187 Jul 25 03:07 /tmp/adblocklist.conf.bak
-rw-r--r-- 1 root root 60357 Jul 25 03:14 /tmp/adblockserverlist.conf.bak

Also inspecting /var/log/messages for logs from dnsmasq reveals everything is working as expected, which is also confirmed by looking at the log file /var/log/adblocker-dnsmasq.log:

Shutting down dnsmasq: [ OK ]
Starting dnsmasq: [ OK ]
Mon Jul 25 03:26:15 UTC 2016

/etc/dnsmasq.d/adblocklist.conf
---
address=/.../10.42.2.1
address=/.../10.42.2.1

address=/.../10.42.2.1

...

address=/.../10.42.2.1

---

/etc/dnsmasq.d/adblockserverlist.conf:

---

server=/.../

---

dnsmasq (pid 11431) is running...

That is all folks.

Now log into any computer or mobile device served by this ad-filtering installation of dnsmasq and browse some ad-heavy web sites. The sites will load faster and cleaner without the advertising, and your browsing security improves as well, since far fewer third-party websites are tracking your behavior. Here is a sample of the ads blocked by dnsmasq as logged by my little 410 server to which the blocked domains are redirected:
