 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Is Facebook getting past my hosts file?

Fri Jun 16, 2017 3:46 pm

I use this to block Facebook. So far, to the best of my knowledge, it has always worked perfectly.

Just now, I got linked to a Guardian article, and as it was loading, I noticed "connected to www. facebook .com" flash by in the status bar in the lower left. [1] What? Sure enough, it got through (and presumably has gotten through elsewhere) far enough that Privacy Badger is aware of it and blocking its cookies (but not FB in its entirety). This seems to hold true for The Guardian's entire site. I checked /etc/hosts, and it is alive and well (including the www. facebook .com entry). I pinged www. facebook .com, and as expected, the request didn't leave this computer.
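For anyone who wants to repeat the ping check without ping, here is a minimal Python sketch that asks the OS resolver (which consults /etc/hosts on a typical setup) what a name maps to. The function names `resolved_addresses` and `looks_blocked` are made up for this post, and treating "only unspecified or loopback addresses" as "blocked" is my assumption about how the hosts file is written, not something any tool defines.

```python
import ipaddress
import socket


def resolved_addresses(hostname, getaddrinfo=socket.getaddrinfo):
    """Return the sorted, de-duplicated addresses the OS resolver gives for hostname.

    The getaddrinfo parameter exists only so a fake resolver can be swapped in.
    """
    try:
        infos = getaddrinfo(hostname, 443, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # the name did not resolve at all
    return sorted({info[4][0] for info in infos})


def looks_blocked(addresses):
    """True if every address is unspecified (0.0.0.0, ::) or loopback (127.x, ::1)."""
    if not addresses:
        return False
    parsed = [ipaddress.ip_address(a.split("%")[0]) for a in addresses]
    return all(ip.is_unspecified or ip.is_loopback for ip in parsed)
```

Calling `looks_blocked(resolved_addresses("www.facebook.com"))` on the affected machine should say whether the hosts entry is what the resolver actually hands back. It says nothing about what the browser does with that answer.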

I turned Wireshark on the problem, capturing for about 10 seconds while loading a Guardian page (enough time to confirm Privacy Badger was aware of Facebook's existence on that page), but didn't find anything interesting. It clearly didn't capture a couple of other things PB noted and let through, though. All I can think of is something to do with CDNs, which were prominently represented in the traffic, but which I don't know enough about. IP list is here. There weren't any noteworthy DNS requests in that capture, either.

I've attempted to mess with Firefox's debug tools, but there's a ton of garbage on this page cluttering everything up and I'm not always sure what I'm looking at. This (and a duplicate of it) is the only FB thing in net traffic as far as those debug tools can tell - it looks like an obvious pixel tracker that got blocked (by PB?), but I'm still not 100% sure what happened to it. The part that generated that request is obvious enough at least. It's just in HTML, directly below body. There's a stupid amount of source code for this page and I'm not sure where else to look for anything FB related.

I would say PB is blocking it before the DNS request is made based on the timings in FF's debug tools, but FB tries to track basically everything everywhere and with this hosts file I don't usually see PB being aware of it. Also, PB says it's only blocking cookies, so does that mean pixels too?

Any ideas what's happening here? I'll learn how to debug whatever needs to be debugged, I'm just not sure where to go next.

[1] Edited to defeat the annoying auto link completion; ignore the spaces in the URLs up there.
 
Pizzapotamus
Gerbil
Posts: 50
Joined: Tue Aug 28, 2007 11:18 am

Re: Is Facebook getting past my hosts file?

Fri Jun 16, 2017 5:42 pm

With a hosts file your browser is still going to see webpages telling it to go access things from facebook.com, and it will do so; it's just that facebook.com no longer actually points to actual Facebook. So seeing a "connected to facebook.com" seems plausibly harmless to me. The same goes for 3rd party tools telling you they're blocking something to or from facebook.com, if they're just looking at and blocking the connection attempt based on the hostname/URL and are unaware that facebook.com no longer actually points to Facebook. For something similar, I have some domains blocked at the DNS level and also run NoScript. On plenty of pages I can see those in the list of domains, but if I went ahead and allowed that "script" to run, it wouldn't actually succeed in loading anything.
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Is Facebook getting past my hosts file?

Fri Jun 16, 2017 6:47 pm

Firefox usually reports this stuff in the order looking up X -> connecting to X -> connected to X -> transferring data from X. I think of connecting/connected as meaning it's already handled the DNS side of things, though I'm not clear on what connected means in that sequence (under what circumstances would it have established a connection but not be transferring anything?). Either way, FF should have no idea what connection it's even trying to establish until it gets an IP from DNS, and it should never have had any opportunity to have gotten an IP for Facebook. It shouldn't even be showing up in F12 -> network, right?

I have confirmed that Privacy Badger does note attempted-but-not-completed connections to Facebook, but for most of them it doesn't bother blocking because it can't see any tracking. It learns as it goes rather than having any predefined blocklist, so FB being blocked means that PB won't doubly block it. OTOH, that tracking pixel is pretty obvious and requires no response from FB to do its thing, so maybe PB is smart enough to block it anyway.

Unblocking it in PB doesn't seem to change anything in Wireshark or FF's tools, so I guess the biggest question is what FF saying "connected to" actually means.
 
odizzido
Gerbil Team Leader
Posts: 211
Joined: Fri May 06, 2005 6:10 am

Re: Is Facebook getting past my hosts file?

Fri Jun 16, 2017 7:37 pm

If you're not getting anything on Wireshark I would imagine nothing is getting through. Have you tried blocking the book of faces on your router (if you can) to make sure?
 
Pizzapotamus
Gerbil
Posts: 50
Joined: Tue Aug 28, 2007 11:18 am

Re: Is Facebook getting past my hosts file?

Fri Jun 16, 2017 7:52 pm

You've got that one screenshot of the Firefox dev tools showing the Network tab with the facebook.com entry selected. The fact that it shows as not secure is a sign to me that it's blocked, as I'd think a connection to real Facebook would end up with working HTTPS. But if, after selecting that entry, you went to the "Headers" subtab rather than "Timings" as in that screenshot, doesn't it also show the "remote address"? I would think that should show 0.0.0.0 and/or a complete lack of response.
 
NovusBogus
Graphmaster Gerbil
Posts: 1408
Joined: Sun Jan 06, 2013 12:37 am

Re: Is Facebook getting past my hosts file?

Sat Jun 17, 2017 1:55 pm

Pizzapotamus wrote:
With a hosts file your browser is still going to see webpages telling it to go access things from facebook.com and it will do so, it's just that facebook.com no longer actually points to actual facebook.

This, unless you've concrete evidence via Wireshark that the actual FB.com servers are being reached. The way a hosts file works is that when the OS gets an outgoing request for a hostname listed in it, it hands back the specified IP address rather than whatever IP address gets provided by the interwebz. So any hyperlink, script, etc. that's asking for a blocked facebook.com hostname is still going to tell the OS to go there, but it will get pointed at 0.0.0.0. Since this all happens at the OS level, the browser isn't going to tell you anything useful.
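To make that lookup concrete, here is a rough Python sketch of what a hosts-file lookup amounts to. It is a simplification for illustration, not the OS's actual resolver code, and `parse_hosts`, `lookup`, and the sample entries are all made up for this post. One thing it does reflect accurately: hosts files match exact hostnames only, so there are no wildcards and every subdomain needs its own entry.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to the first address listed for it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # setdefault keeps the first mapping seen for a name
            table.setdefault(name.lower(), address)
    return table


def lookup(table, hostname):
    """Return the override for an exact match, or None to fall through to DNS."""
    return table.get(hostname.lower())


# Hypothetical entries, standing in for a real /etc/hosts.
SAMPLE = """
127.0.0.1   localhost
0.0.0.0     facebook.com www.facebook.com   # block entries
"""

table = parse_hosts(SAMPLE)
```

With that sample, `lookup(table, "www.facebook.com")` gives back the 0.0.0.0 override, while a name that isn't listed (a different subdomain, or a CDN hostname) returns None and would go out to real DNS.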

It *is* possible to have a situation where a CDN or other middleman serves up something from Facebook, but hosts file won't do anything there since it's not your machine talking to FB but theirs. You'd have to blackball the middleman.
 
Redocbew
Minister of Gerbil Affairs
Posts: 2495
Joined: Sat Mar 15, 2014 11:44 am

Re: Is Facebook getting past my hosts file?

Sat Jun 17, 2017 2:44 pm

A CDN is just a spot to put static content. It's a common practice to put all your dynamic scripts (PHP, Ruby, C#, etc.) and static content (images, stylesheets, file downloads) on different hosts to help reduce load on the host(s) serving dynamic content. Sometimes it's mapped to a subdomain, or sometimes it's under a completely different hostname, and usually the DNS for it is load balanced across multiple IPs, but it doesn't really behave any differently than any other web host in terms of requests.

Like others have said, the "block" is probably still working just fine. It's just that the browser doesn't even know it's there. If you want to actually deny the requests instead of just sending them out into the void somewhere, then you'll need a different tool for the job. That'd probably be something you'd do at the router/firewall instead of on your local machine.
Do not meddle in the affairs of archers, for they are subtle and you won't hear them coming.
 
synthtel2
Gerbil Elite
Topic Author
Posts: 956
Joined: Mon Nov 16, 2015 10:30 am

Re: Is Facebook getting past my hosts file?

Sat Jun 17, 2017 3:11 pm

odizzido: Nah, this router is a piece of junk (it doesn't usually matter because the connection is too), and the machine hops networks often enough anyway.

Pizzapotamus: It's the complete lack of response option, not that that would seem to inherently matter for a tracking pixel like this.

The main failure mode I was thinking of here is one where something in that ridiculous mountain of JS has a way of getting to Facebook without going through the usual DNS channels. That would be pretty nutty for all kinds of reasons, but modern web dev is pretty nutty for all kinds of reasons, and I'm not familiar with all that nuttiness in detail. This debugging looks like it's got that ruled out, though.

The main question I've still got is what Firefox means by "connected to".
 
Redocbew
Minister of Gerbil Affairs
Posts: 2495
Joined: Sat Mar 15, 2014 11:44 am

Re: Is Facebook getting past my hosts file?

Sat Jun 17, 2017 4:39 pm

In the network monitor, "connecting" means it's done with DNS resolution and attempting to establish a TCP connection, but it won't actually do that in this case because of your hosts file entries. I'm assuming the fact that it said "connected to" in the status bar has little to do with what's actually happening internally. As far as I'm aware there will always be a status code returned for each successful request even if there's no data sent as a response, and looking at your screenshot there isn't one. That also makes me think there's no connection being made here.
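If you want to see those stages outside the browser, a small Python sketch like this separates "the name resolved" from "a TCP connection was actually established". It's only an approximation of what a browser does (no TLS, no HTTP), and `probe` and its return strings are names I made up for the example.

```python
import socket


def probe(host, port, timeout=3.0):
    """Report how far a plain TCP connection attempt to host:port gets."""
    try:
        # Stage 1: name resolution (hosts file first on a typical setup, then DNS).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "lookup failed"
    for family, socktype, proto, _, sockaddr in infos:
        try:
            # Stage 2: TCP handshake with each resolved address in turn.
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "connected to " + sockaddr[0]
        except OSError:
            continue  # refused, unreachable, or timed out; try the next address
    return "connect failed"
```

Running `probe("www.facebook.com", 443)` with the hosts entry in place shows which stage the attempt dies at, and the address in a "connected to" result tells you whether it was the real thing or just whatever the entry points at.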
