By Thomas Baekdal - February 2023

Operational security and the dangers of online sharing for journalists

This is an archived version of a Baekdal Plus newsletter (it's free). It is sent out about once per week and features the latest articles as well as unique insights written specifically for the newsletter. If you want to get the next one, don't hesitate to add your email to the list.

Welcome back to the Baekdal Plus newsletter. Today, I have a story about protecting information, because I came across an example of a journalist who thought they were doing everything right, but accidentally revealed information about their sources. I want to talk about how easy it is to make that mistake by giving you a number of examples.

I'm obviously not going to talk about the specific case I saw, but here are some general tips (I posted about most of these before on social media). Of course, using these examples you can either hide things that, as a journalist, you don't want people to know (like information about your sources), or you can use the same example to reveal information that others didn't want you to see.

So, here we go:


One of the fundamental problems with digital is that there are a lot of things we can extract from the data that aren't immediately obvious.

Always delete unwanted audio

Here is a simple example. An ambulance driver posted a video of himself going to a scene, but to protect the victim's privacy, he had lowered the audio of the video. He didn't mute the audio or, better, delete it. He just made it so low that you couldn't really hear it.

Now, of course, anyone who has ever edited a podcast knows that this doesn't do anything. If the audio is simply lowered, all you need to do is to import the file into your audio editor, and bring it back up again.

It takes two seconds to do, and suddenly there is no longer any privacy.
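To make this concrete, here is a minimal Python sketch of why turning the volume down protects nothing. The sample values and the `change_volume` helper are made up for illustration, and real audio files hold far more samples, but the arithmetic is the same: lowering is a multiplication, and a multiplication can be undone.

```python
# Hypothetical 16-bit audio samples standing in for a recorded voice.
original = [12345, -8021, 15999, -3140, 9876, -14002]

def change_volume(samples, gain):
    """Scale every sample by `gain`, rounding and clipping to the 16-bit range."""
    return [max(-32768, min(32767, round(s * gain))) for s in samples]

# "Protecting" the audio by turning it down to 1% of its volume.
lowered = change_volume(original, 0.01)

# Anyone with an audio editor can turn it back up by the inverse factor.
restored = change_volume(lowered, 100)

# Largest difference between the original and the restored samples.
worst_error = max(abs(a - b) for a, b in zip(original, restored))
print(lowered)
print(restored)
print(worst_error)
```

The restored samples differ from the originals only by a small rounding error, which in practice means a little extra noise, not privacy.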

The same is true if you add something on top of the audio. If you are trying to hide a section of an audio track, you might add a beep on top of the file.

Normally, that masks the audio (all you can hear is the beep), but what you may not know is that a beep is usually limited to a specific frequency. So, again, all you need to do is to open your audio editor, identify where that beep is and delete just that part of it, and then level the remaining audio back up. Now, you have the audio in its original form ... without any of the beeps.

Again, this is only possible because, in this case, the beep was added on top of the existing audio (and the audio was lowered to mask it). If they had instead replaced the original with the beep, none of this could be done.
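As a rough illustration, the Python sketch below mixes a loud beep on top of a quiet stand-in for speech and then subtracts the beep again. Everything in it is an assumption made for the example: the sample rate, the frequencies, and the idea that the beep is one pure tone that fits a whole number of cycles into the clip. Real editors use notch filters or spectral editing, and real speech overlaps with the beep more than this toy signal does.

```python
import math

RATE = 8000      # samples per second (assumed)
N = 800          # 0.1 seconds of audio
BEEP_HZ = 1000   # frequency of the masking beep (assumed to be known)

def tone(freq, amp, n=N):
    """A pure sine tone of `n` samples."""
    return [amp * math.sin(2 * math.pi * freq * i / RATE) for i in range(n)]

# Stand-in for speech: two low tones, turned down so the beep dominates.
speech = [a + b for a, b in zip(tone(200, 0.03), tone(350, 0.02))]
beep = tone(BEEP_HZ, 0.9)
published = [s + b for s, b in zip(speech, beep)]

def remove_tone(samples, freq):
    """Subtract the sine/cosine component at `freq` (a crude notch filter).

    The 2/n scaling is only exact when the clip holds a whole number of
    cycles of `freq`, which is true for the values chosen above.
    """
    n = len(samples)
    sin = [math.sin(2 * math.pi * freq * i / RATE) for i in range(n)]
    cos = [math.cos(2 * math.pi * freq * i / RATE) for i in range(n)]
    a = 2 / n * sum(x * s for x, s in zip(samples, sin))
    b = 2 / n * sum(x * c for x, c in zip(samples, cos))
    return [x - a * s - b * c for x, s, c in zip(samples, sin, cos)]

recovered = remove_tone(published, BEEP_HZ)
max_error = max(abs(r - s) for r, s in zip(recovered, speech))
print(max_error)
```

After the subtraction, what is left is the quiet speech, which can then be amplified like any other lowered audio.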

So, whenever you work with audio and there is a section that you don't want people to hear, make sure that you completely delete that part of the audio. Don't just lower it, or mask it, or do anything else. If there are any remaining elements of the audio left, most audio editors can very quickly recreate it.

Be very careful about blacked out text

Another example is with blacked out text. This is something journalists have to work with all the time, and it's really critical that you do it right, otherwise you can bring the original text back.

If you have documents in which you don't want people to see, for instance, your source's name, don't just use a black marker. Even if the text is blacked over, someone can probably just open the image in Photoshop, adjust the brightness levels, and reveal it again.

It's sometimes the same with digital apps. Some 'marker' apps don't do a full black marker, they do a marker with a bit of transparency, meaning it doesn't actually hide anything.

So you think it's blacked out, but it isn't.
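Here is a small Python sketch of why a slightly transparent marker fails. It assumes a simple alpha blend over grayscale pixels (255 is white paper, 0 is black ink), and the pixel row and function names are hypothetical. Stretching the levels is roughly what the brightness adjustment in an image editor does.

```python
# Hypothetical grayscale pixel row: 255 = white paper, 0 = black ink.
row = [255, 255, 0, 0, 255, 0, 255, 255]

def apply_marker(pixels, opacity):
    """Alpha-blend a black marker over the pixels: out = pixel * (1 - opacity)."""
    return [round(p * (1 - opacity)) for p in pixels]

def stretch_levels(pixels):
    """Rescale so the darkest value maps to 0 and the brightest to 255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:  # truly uniform: there is nothing left to recover
        return [0] * len(pixels)
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]

almost_black = apply_marker(row, 0.95)  # looks black to the eye
revealed = stretch_levels(almost_black)

solid = apply_marker(row, 1.0)          # fully opaque marker
nothing = stretch_levels(solid)

print(almost_black)
print(revealed)
print(nothing)
```

With 95% opacity, the paper and the ink still end up with different values, so the text comes straight back. Only the fully opaque marker leaves nothing to recover.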

It's the same thing with PDF files. Many apps let you do things with PDFs, like drawing a completely black box on top of an address you don't want people to see. But that box can often be removed simply by opening the PDF with an 'open images' option ... or worse, the PDF can be deconstructed to extract only the text, making all those black boxes you added meaningless.

Here is a simple example. This PDF has a number of black boxes added to it, but if I open it up in my editing program, I can simply select these boxes and delete them to reveal the text underneath.

In other words, when you work with blacked out boxes, you need to save the result as a new image (flattened). Don't just export it as a PDF.
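To show why a box drawn on top hides nothing, here is a toy Python sketch. The content stream below is hand-written and uncompressed, and the name in it is a placeholder. Real PDFs usually compress their streams and encode text in more complicated ways, so treat this as an illustration of the principle and not as a working PDF parser.

```python
import re

# A hand-written, uncompressed PDF content stream (hypothetical example).
# The text is drawn first, then a filled black rectangle is painted over it.
content_stream = b"""
BT /F1 12 Tf 72 700 Td (Source: Jane Doe) Tj ET
0 0 0 rg
70 695 120 16 re f
"""

def extract_text_operands(stream: bytes) -> list[str]:
    """Return the string operands of every `Tj` (show text) operator."""
    return [m.decode("latin-1") for m in re.findall(rb"\((.*?)\)\s*Tj", stream)]

hidden = extract_text_operands(content_stream)
print(hidden)
```

The rectangle is just one more drawing instruction that comes after the text. The text itself is still stored in the file, which is why flattening to an image is the safer route.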

Today's cameras are huge!

Speaking of images, one of the things I often come across is journalists taking pictures of their desks (maybe someone gave them a cake?). The problem is that today's mobile phones have absolutely crazy big cameras, and while the picture looks tiny on Instagram, you can often just download it in the original resolution.

The problem with this is that, as journalists, we often have things around our desks that we don't want people to see. But now they can just zoom in on your high-quality picture.

Here is a simple example:

Here is a picture I took of my (old) home office. At first glance, you can't see anything. The text on the screen is too tiny to make out.

But, let's zoom in.

At first, it still looks too blurry, but the more you look at it, the more of it you can start to make out. For instance, you can read that it starts with: "The reason for this is obvious."

Now, in this case, there is nothing secret about this. I was just writing one of my articles. But imagine if it had been Outlook that I had open on my screen. Or imagine if you had a post-it note with someone's contact information, or anything else.

Today's mobile cameras are so good that people can see way more than you think.

More than that, the revealing information is often something you don't think about. Remember that there are many places on your computer that can reveal information that you might not want others to see, such as the apps running in the taskbar, the window titles of those apps, or, if you have a browser open, the names of the tabs.

Here is an example from a politician from a few years back. He was complaining about something on Yahoo, and had taken a screenshot to prove his case ... but look at the other browser tabs... right?

But, as journalists, we face this problem. Most journalists have tons of browser tabs open all the time, and while the contents might not be as embarrassing as this, what if instead it showed the chat session title with one of your sources?

So, do not share screenshots of your screen unless you are absolutely sure it's safe to show. This goes for Zoom or Microsoft Teams too. If you share your screen .... oh boy... that can be bad!

And remember, sometimes you don't know when it's happening. Most people today, including journalists, have notifications turned on on their computers. And so if you are sharing your screen, and everything on it is absolutely safe to show, at any moment your computer can decide to show you a notification about something that you really did not want other people to see (like the name of a source, or a document that was just shared with you).

Screen sharing is a journalistic nightmare!

Hidden tracking information

Another really big danger, when it comes to protecting your sources, is that it's really easy to embed hidden trackers into the documents that a source might share with you.

Obviously, if a source sends you a link to a website ... I mean, the amount of tracking that can be done with that is immense. Most newspapers already know about this, and they would never communicate with a source via something as unsafe as Twitter DMs or similar.

But the tracking doesn't even have to be digital.

For instance, never post screenshots of documents you have received from whistleblowers. The reason is that it's incredibly easy to embed hidden tracers into text that can be individualized, like I just did here. Notice the double space?

My guess is that you didn't, but it is trivially easy to add tiny unique variations to a document, so that you can precisely identify who leaked it afterwards. It can be a space, or anything else. Maybe a document sent to one executive is using Oxford commas in one paragraph, where the document sent to another executive isn't.

You will never notice this when you read the text, but it instantly reveals your source. So, never, ever, share any documents that a source has provided you with.
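The Python sketch below shows how little it takes to tell two "identical" copies apart. The memo text and the `fingerprint` helper are invented for the example, and it only looks at the two tricks mentioned above (double spaces and Oxford commas). Someone marking documents on purpose could vary many other details.

```python
import re

# Two hypothetical copies of the "same" memo, sent to different recipients.
copy_a = "The merger will close in March. Staff will be told, briefed, and moved."
copy_b = "The merger will close  in March. Staff will be told, briefed and moved."

def fingerprint(text: str) -> dict:
    """Collect small features that readers overlook but that can differ per copy."""
    return {
        "double_spaces": [m.start() for m in re.finditer(r" {2,}", text)],
        "oxford_commas": len(re.findall(r",\s+(?:and|or)\b", text)),
    }

fp_a = fingerprint(copy_a)
fp_b = fingerprint(copy_b)
print(fp_a)
print(fp_b)
```

If a screenshot of copy B ends up online, whoever prepared the copies only has to compare fingerprints to know which recipient leaked it.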

Also, remember the metadata. Metadata is data that is embedded into a file but that you cannot normally see. This metadata can contain a lot more information than you think.

You can do image watermarks that are completely invisible to the naked eye, but can contain the precise data to reveal your sources. But most of the time, the metadata is plainly visible. All you have to do is look it up.

This could be things like the "author" of a document or who last modified it (which might reveal your source to the public), the original path where the document is saved on your source's internal company network (revealing their username), or all kinds of other information.
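As an illustration of how easy this is to look up, here is a short Python sketch that reads the author fields from a Word document. A .docx file is a zip archive, and the author information sits in a small XML file called docProps/core.xml inside it. The sketch builds a tiny stand-in archive in memory so it can run on its own. The names in it are made up, and a real document contains many more files and fields.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Build a tiny stand-in for a .docx in memory (the names are hypothetical).
core_xml = (
    '<cp:coreProperties xmlns:cp="%s" xmlns:dc="%s">'
    "<dc:creator>j.doe</dc:creator>"
    "<cp:lastModifiedBy>Jane Doe</cp:lastModifiedBy>"
    "</cp:coreProperties>"
) % (NS["cp"], NS["dc"])

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", core_xml)

def read_core_properties(docx_file) -> dict:
    """Return the author fields stored in a .docx file's docProps/core.xml."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {
        "creator": root.findtext("dc:creator", namespaces=NS),
        "last_modified_by": root.findtext("cp:lastModifiedBy", namespaces=NS),
    }

buffer.seek(0)
properties = read_core_properties(buffer)
print(properties)
```

The same function should work with a path to a real .docx file in place of the in-memory buffer, which is a good check to run on anything before it leaves your hands.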

Also remember that many modern word processors have version control, meaning the document has all the revisions made to it stored within. This can potentially reveal so much problematic information, including early notes used when first drafting the documents, as well as comments made by different team members when discussing what to write.

None of this is visible in the final document, but with revision control, you can just go back and see it. That's really problematic.
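Here is a minimal Python sketch of what that looks like inside a Word file. In the .docx format, tracked deletions are kept in the document's XML inside `w:del` elements. The fragment below is hand-written and simplified (a real document carries more attributes and structure), and the sentence and author name are invented.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# A hypothetical, simplified fragment of word/document.xml
# containing one tracked deletion.
document_xml = f"""
<w:document xmlns:w="{W}">
  <w:body><w:p>
    <w:r><w:t>Layoffs are planned</w:t></w:r>
    <w:del w:author="j.doe">
      <w:r><w:delText> for the Oslo office</w:delText></w:r>
    </w:del>
    <w:r><w:t>.</w:t></w:r>
  </w:p></w:body>
</w:document>
"""

root = ET.fromstring(document_xml)

# Text that remains in the final version of the document.
visible = "".join(t.text for t in root.iter(f"{{{W}}}t"))

# Deleted text that is still stored in the file, with who deleted it.
deleted = [
    (d.get(f"{{{W}}}author"), "".join(t.text for t in d.iter(f"{{{W}}}delText")))
    for d in root.iter(f"{{{W}}}del")
]
print(visible)
print(deleted)
```

The final text no longer mentions the office, but the file still does, along with the name of the person who removed it.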

Geolocation

Finally, there are a lot of ways that people can identify where you are, which might be problematic if you are working on a story involving a source.

Let me show you something scary. Back in August, I took this picture out of my window (from my old apartment, I have since moved), and in just a minute, a person used Google Maps to precisely identify where that is.

Now, I didn't feel particularly threatened by this (I don't think he meant any harm), but when I shared it with some of my female acquaintances, they really didn't like this. And as journalists, this is even more scary considering the threats we often face.

But also, notice that the image I shared contained no identifying marks. There were no signs, or anything. But, by seeing the name of the Fire Department (identifying my city), and the shape of buildings, roads, etc. ... that was enough.

And keep in mind, it doesn't even have to be a picture out of your window. Take a picture like this one. This, again, was my old home office from before I moved. It's just my desk... right? Except, you can also see what is outside the window, and, as you can see above, this is enough to precisely identify where I lived ... in just a minute of work on Google Maps.

Of course, we can use the same techniques the other way around. In fact, publishers like Bellingcat are famous for using these techniques to identify the exact location of news events.

So, it goes both ways. Every one of these examples can be used by journalists and newsrooms to reveal information that others don't want us to see or know, but it can also be used against us.

And the mistake I see most often is when we think we have protected ourselves or our sources, but reveal it anyway because we don't realize all the ways this can be done.

As I mentioned in the beginning, I was reminded of this because I came across a journalist who had done just this. They shared a picture that had nothing to do with anything, but it contained information in a small part of it that revealed a contact.

So, be very mindful of these things. It is easy to manage the big stuff, like only talking with sources via secure channels. It's hard to not make all of these other mistakes because, at the time, they look completely innocent.


Yet another example of why ChatGPT cannot be used for journalism

The Washington Post published an article about yet another area where ChatGPT goes horribly wrong. This time it's about math. As they say: "ChatGPT Needs Some Help With Math Assignments: 'Large language models' supply grammatically correct answers but struggle with calculations."

This should not come as a surprise to you if you have been reading my articles about AIs recently. Almost all of these new AI systems are really good at 'making shit up', but they fundamentally have no concept of what is a fact or not.

As the Post writes: "It isn't hard, and in fact a little entertainment, to feed the bot questions to which it responds with confident nonsense".

(Note: The correct answer is "23 pieces of fruit. 14 bananas and 9 oranges").

But this isn't just about math. Tools like ChatGPT are bad at everything fact-based. The best analogy I can give you is to think about essay assignments in school, and compare language class with your history class.

When I was in school, our language teacher would often give us a picture of a painting, and then we would be asked to write an essay about it. Of course, as students, we had no idea what the painting was (nor did the teacher). It was just a (boring) picture of something, but that didn't matter either.

The assignment was not about understanding the painting. Instead, it was a test to see how good you were at writing. And the students who did the best writing, with the best grammar, sentence structure, and general flow, were the ones who got the best grade. What we wrote was irrelevant, as long as it was written well.

Compare this, however, to your history class. Now, if you were handed a picture of a painting and were asked to write an essay about it, the assignment was completely different. Now what matters is whether you know who painted it, when it was painted, what style it is in, and other historical and societal elements that influenced it.

It's a history test, not a writing test.

Think about this difference between language class and history class. This is where ChatGPT (and so many AIs) goes wrong. They are really good at language class. They can write an endless number of essays with perfect sentence structure and grammar, all sounding very convincing. But they don't know anything, so they fail at anything where you have to be factual.

This is vitally important to understand for journalists and editors. As journalists, our job is not to be good at writing. It's a plus if we are, but that is secondary to our ability to define facts. In other words, as journalists, we need to be really good at history (or math) class, not language class.

ChatGPT (and most other AI tools like it) is fundamentally not a tool that can be used for journalism. At its very core, it simply is not designed for it. We are not talking about just adjusting a few parameters here. We are talking about the very essence of what it does.

As the Washington Post also wrote:

"ChatGPT's struggle with math is inherent in this type of artificial intelligence, known as a large language model. It scans enormous reams of text from across the web and develops a model about what words are likely to follow others in a sentence. It's a more sophisticated version of autocomplete that, after you type "I want to" on your device, guesses the next words are "dance with somebody," "know what love is", or "be with you everywhere."

"A Mad Libs-proficient supercomputer might be extremely effective for writing grammatically correct responses to essay prompts, but not for solving a math problem. That is the Achilles' heel of ChatGPT: It responds in an authoritative-sounding language with numbers that are grammatically correct and mathematically wrong".
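The autocomplete idea in that quote can be sketched in a few lines of Python. This is a toy built on a made-up corpus that only counts which word most often follows the previous one. Real large language models are neural networks trained on vastly more text and context, so this is an illustration of the intuition, not of how ChatGPT is implemented.

```python
from collections import Counter, defaultdict

# A tiny hypothetical "training corpus".
corpus = (
    "i want to dance with somebody . "
    "i want to know what love is . "
    "i want to be with you everywhere . "
    "two plus two is five . two plus two is five . two plus two is four ."
).split()

# Count which word follows which.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def autocomplete(prompt: str, length: int) -> str:
    """Extend the prompt by repeatedly picking the most frequent next word."""
    words = prompt.split()
    for _ in range(length):
        candidates = following[words[-1]]
        if not candidates:
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

answer = autocomplete("two plus two is", 1)
print(answer)
```

The toy model completes the sentence with "five" because that is the word it has seen most often after "is", not because it has calculated anything. The sentence is grammatical and confidently wrong.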

My very simple advice to you right now is this: Do not use ChatGPT or similar tools for journalism.


A guide to using AI for publishers

Of course, this doesn't mean that we shouldn't use AIs at all in journalism. There is a big future for AI for us, but it's very different, and it requires a different approach.

If you want to learn more about this, I recently published: "A guide to using AI for publishers".


Support this focus

Also, remember that while this newsletter is free for anyone to read, it's paid for by my subscribers to Baekdal Plus. So if you want to support this type of analysis and advice, subscribe to Baekdal Plus, which will also give you access to all my Plus reports (more than 300), and all the new ones (about 25 reports per year).


Thomas Baekdal

Founder, media analyst, author, and publisher.

"Thomas Baekdal is one of Scandinavia's most sought-after experts in the digitization of media companies. He has made himself known for his analysis of how digitization has changed the way we consume media."
Swedish business magazine, Resumé

 
