Saturday, 04 April


Not Just 'The Death of IT'. Cringely Also Predicts Layoffs For Many IT Contractors [Slashdot]

Last week long-time tech pundit Robert Cringely predicted "the death of IT" in 2020 due to the widespread adoption of SD-WAN and SASE. Now he's predicting "an even bigger bloodbath as IT employees at all levels are let go forever," including IT consultants and contractors: "My IT labor death scenario now extends to process experts (generally consultants) being replaced with automation. In a software-defined network, whether that's SD-WAN or SASE, so much of what used to be getting discrete boxes to talk with one another over the network becomes a simple database adjustment. The objective, in case anyone forgets (as IT, itself, often does) is the improvement of the end-user experience, in this case through an automated process. With SD-WAN, for example, there are over 3,000 available Quality of Service metrics. You can say that Office 365 is a critical metric as just one example. Write a script to that effect into the SD-WAN database, deploy it globally with a keyclick and you are done... It's slowly dawning on IBM [and its competitors] that they have to get rid of all those process experts and replace them with a few subject matter experts. Here's the big lesson: with SD-WAN and SASE the process no longer matters, so knowing the process (beyond a few silverbacks kept on just in case the world really does end) isn't good for business." Cringely predicts the downgrading of corporate bonds will also put pressure on IBM and its competitors, perhaps ultimately leading to a sale or spin-off at IBM. "Either they sell the parts that don't make money, which is to say everything except Red Hat and mainframes, or they sell the whole darned thing, which is what I expect to happen." With that he predicts thousands of layoffs or furloughs — and while the bond market puts IBM in a bigger bind, "this could apply in varying degrees to any IBM competitors."
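Cringely's claim that QoS tuning collapses into "a simple database adjustment" can be sketched in a few lines. The sketch below is purely hypothetical and vendor-neutral: the QosPolicy record, the site names, and the push_global function are all invented for illustration, not any real SD-WAN API.

```python
# Hypothetical sketch: in a software-defined WAN, a QoS rule is just data.
# Marking an application as business-critical becomes a record update that
# the controller pushes to every edge site, rather than per-box CLI work.

from dataclasses import dataclass

@dataclass
class QosPolicy:
    app: str                 # traffic class the rule matches
    priority: str            # forwarding class to assign
    min_bandwidth_pct: int   # floor reserved for this traffic

def push_global(policy: QosPolicy, sites: list[str]) -> dict[str, QosPolicy]:
    """Simulate a one-click global deploy: every site gets the same record."""
    return {site: policy for site in sites}

critical = QosPolicy(app="Office 365", priority="business-critical",
                     min_bandwidth_pct=30)
deployed = push_global(critical, ["nyc-edge", "lon-edge", "sgp-edge"])
print(deployed["lon-edge"].priority)  # business-critical
```

The point of the sketch is the shape of the work: one data change fans out everywhere, which is exactly why the process expertise Cringely describes stops being a job.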

Read more of this story at Slashdot.


Saturday Morning Breakfast Cereal - Foraging [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

Thanks to the Patreon Squad for helping me fix an earlier version of this!

Today's News:


Y Combinator Company 'Flexport' Is Shipping PPE To Frontline Responders [Slashdot]

The Y Combinator company Flexport is a San Francisco-based freight-forwarding and customs brokerage company. (Its investors include Google Ventures and Peter Thiel's Founders Fund.) But on March 23rd Flexport announced they were re-focusing all their resources to get critical supplies to frontline responders combating COVID-19. On Friday the team they joined announced "we're shipping full cargo planes filled with PPE to protect frontline responders," citing a partnership with Atlas Air and United Airlines. Atlas Air delivered a dedicated charter plane for this mission on Thursday, April 2nd. Originating in Shanghai, the plane contained over 143,000 pounds of PPE for medical systems in California, including approximately:

- 4,500,000 medical masks
- 116,000 disposable medical protection coveralls
- 121,300 surgical gowns

For this volume of goods, significant capacity is needed on a plane. However, global travel has plunged because of the outbreak, meaning that passenger planes which used to carry cargo are grounded, and air market capacity is extremely limited. And hospitals, which in normal situations aren't importing their own goods, can't arrange cargo on a plane on their own... Crews from United Airlines volunteered to help, arriving at SFO [San Francisco International Airport] at 6AM to unload and unpack the plane. The cargo was then put on a truck and delivered directly to hospitals that will distribute the PPE across the state based on need... Up next, we're moving cargo to New York and will share updates next week. Please continue to help us spread the word to support the response efforts. They're raising money on GoFundMe, and this "Frontline Responders Fund" has so far raised over $6 million from 15,800 donors. Their page notes that on Thursday former California governor Arnold Schwarzenegger "personally helped us deliver a trucking shipment from MedShare with 49,000 donated masks to a hospital in Los Angeles, California."
Their page also notes donations have funded the trucking of goods across America from nonprofits, including:

- All Hands and All Hearts Smart Response, who delivered over 43,000 units of gloves, gowns, face masks, goggles, and hand sanitizer to emergency rooms and hospitals in New York City and Southern California.
- Donate PPE, who delivered over 3,750 N95 respirator masks to hospitals in Brooklyn, NY yesterday.

One of their supporters is actor Clark Gregg, who plays agent Coulson in five Marvel movies and the TV series Agents of S.H.I.E.L.D. He records personalized video greetings for fans through a web site called Cameo, and through Wednesday he donated 100% of the money earned to the Frontline Responders Fund.

Read more of this story at Slashdot.


A Hacker Found a Way To Take Over Any Apple Webcam [Slashdot]

An anonymous reader quotes a report from Wired: Apple has a well-earned reputation for security, but in recent years its Safari browser has had its share of missteps. This week, security researcher Ryan Pickren publicly shared new findings about vulnerabilities that would have allowed an attacker to exploit three Safari bugs in succession and take over a target's webcam and microphone on iOS and macOS devices. Apple patched the vulnerabilities in January and March updates. But before the fixes, all a victim would have needed to do is click one malicious link and an attacker would have been able to spy on them remotely. The bugs Pickren found all stem from seemingly minor oversights. For example, he discovered that Safari's list of the permissions a user has granted to websites treated all sorts of URL variations, including malformed schemes like fake://, as being part of the same site. By "wiggling around," as Pickren puts it, he was able to generate specially crafted URLs that could work with scripts embedded in a malicious site to launch the bait-and-switch that would trick Safari. A hacker who tricked a victim into clicking their malicious link would be able to quietly launch the target's webcam and microphone to capture video, take photos, or record audio. And the attack would work on iPhones, iPads, and Macs alike. None of the flaws are in Apple's microphone and webcam protections themselves, or even in Safari's defenses that keep malicious sites from accessing the sensors. Instead, the attack surmounts all of these barriers just by generating a convincing disguise.
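The class of bug described here can be illustrated with a toy permissions store. This is not Safari's actual code, and the report doesn't give Pickren's exact URLs; the sketch only shows how keying a camera grant on a loosely "normalized" hostname lets an unrelated scheme like fake:// inherit a trusted site's permission, whereas a proper origin check (scheme plus host) does not.

```python
# Illustrative sketch of the bug class: a permissions store that ignores
# the URL scheme will treat any scheme variant of a trusted host as that
# trusted site. Not Safari's real implementation.

from urllib.parse import urlsplit

# The user granted camera access to https://example.com only.
naive_store  = {"example.com": {"camera"}}
strict_store = {("https", "example.com"): {"camera"}}

def naive_perms(url: str) -> set:
    # BUG: keys on hostname alone, so fake://example.com inherits the grant.
    return naive_store.get(urlsplit(url).hostname, set())

def strict_perms(url: str) -> set:
    # Correct: the origin is (scheme, host), so fake:// gets nothing.
    parts = urlsplit(url)
    return strict_store.get((parts.scheme, parts.hostname), set())

print("camera" in naive_perms("fake://example.com"))   # True: the flaw
print("camera" in strict_perms("fake://example.com"))  # False
```

The real attack chained three such oversights together, but each link in the chain was this kind of mismatch between how the permission was granted and how it was looked up.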

Read more of this story at Slashdot.


Dell XPS Ice Lake Taking A Wallop On Ubuntu 20.04 [Phoronix]

With our early benchmarking of Ubuntu 20.04 as it nears the end of its development cycle, we've seen Ubuntu 20.04 boosting Intel Xeon Scalable performance, running well with AMD EPYC Rome, and delivering good AMD Ryzen performance, among other tests. Strangely, though, the one platform where I've found Ubuntu 20.04 regressing hard so far is the Dell XPS 7390 Ice Lake...


Radeon Open Compute 3.3 Released But Still Without Official Navi Support [Phoronix]

This week marked the release of ROCm 3.3 as the newest version of the Radeon Open Compute stack...


High Resolution Wheel Scrolling Back To Being Finished Up For The Linux Desktop [Phoronix]

Added to the mainline Linux kernel over a year ago was high resolution mouse wheel scrolling support. While the support landed on the kernel side to provide "buttery smooth" wheel scrolling, the work has yet to be wrapped up on the user-space side to make this a reality on the Linux desktop...
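For a sense of the user-space work still pending, here is a minimal sketch of the accumulation math a toolkit has to do with the kernel's events. It assumes the kernel's REL_WHEEL_HI_RES convention of 120 units per traditional wheel detent; the SmoothScroller class and pixel values are invented for illustration.

```python
# Sketch of the consumer-side math: the kernel's REL_WHEEL_HI_RES axis
# reports deltas where 120 units equal one detent of a classic wheel, so
# a toolkit accumulates fractions and scrolls smoothly instead of jumping
# a whole line per click.

DETENT = 120  # hi-res units per traditional wheel click

class SmoothScroller:
    def __init__(self, pixels_per_detent: int = 15):
        self.pixels_per_detent = pixels_per_detent
        self._residue = 0  # hi-res units not yet converted to whole pixels

    def feed(self, hi_res_delta: int) -> int:
        """Convert a REL_WHEEL_HI_RES delta into whole pixels to scroll."""
        self._residue += hi_res_delta * self.pixels_per_detent
        pixels, self._residue = divmod(self._residue, DETENT)
        return pixels

s = SmoothScroller()
# Four quarter-detent events from a high-resolution wheel add up to
# exactly one classic click's worth of scrolling, with no rounding loss.
print(sum(s.feed(30) for _ in range(4)))  # 15
```

Keeping the residue means fractional events never drop pixels, which is the property a desktop toolkit needs before hi-res scrolling feels correct.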


Zoom Will Enable Waiting Rooms By Default To Stop Zoombombing [Slashdot]

Zoom is making some much-needed changes to prevent "Zoombombing," a term used to describe when someone successfully invades a public or private meeting over the videoconferencing platform to broadcast shock videos, pornography, or other disruptive content. The act was recently mentioned on the Department of Justice's website, warning that users who engage in this sort of video hacking could face fines and possible imprisonment. TechCrunch reports: Starting April 5th, it will require passwords to enter calls via Meeting ID, as these may be guessed or reused. Meanwhile, it will change virtual waiting rooms to be on by default so hosts have to manually admit attendees. [...] Zoom CEO Eric Yuan apologized for the security failures this week and vowed changes. But at the time, the company merely said it would default to making screensharing host-only and keeping waiting rooms on for its K-12 education users. Clearly it determined that wasn't sufficient, so now waiting rooms are on by default for everyone. Zoom communicated the changes to users via an email sent this afternoon that explains "we've chosen to enable passwords on your meetings and turn on Waiting Rooms by default as additional security enhancements to protect your privacy." The company also explained that "For meetings scheduled moving forward, the meeting password can be found in the invitation. For instant meetings, the password will be displayed in the Zoom client. The password can also be found in the meeting join URL." Some other precautions users can take include disabling file transfer, screensharing or rejoining by removed attendees.

Read more of this story at Slashdot.


AMD ACO Backend Implements 8-bit / 16-bit Storage Capabilities - Needed For DOOM Eternal [Phoronix]

It's been another busy week for Mesa's RADV Vulkan driver with the Valve-backed ACO compiler back-end alternative to AMDGPU LLVM...


Watch: Rare Second World War footage of Bletchley Park-linked MI6 intelligence heroes emerges, shared online [The Register]

A glimpse of life at Whaddon Hall

Vid  An astonishingly rare film documenting British intelligence personnel, linked to the code-breakers at Bletchley Park, has been released by the park's trust, offering a glimpse of unsung heroes who helped win the Second World War.…


Vint Cerf 'No Longer Contagious' With Covid-19 [Slashdot]

DevNull127 writes: "Good news — VA Public Health has certified my wife and me as no longer contagious with COVID19," tweeted 76-year-old Vint Cerf, one of the creators of the modern internet. He added one word. "Recovering!" It seemed especially appropriate that Cerf shared his news online — and that it drew positive responses from grateful people around the world, including several who use the internet in their daily lives. Cerf's tweet immediately drew positive responses from the Internet Society, as well as the chief operating officer of the Cloud Native Computing Foundation, YouTube's director of public policy, and a senior director of communications and public affairs at Google. There were also congratulatory posts from a Georgetown professor of technology and law, from Associated Press reporter Frank Bajak, and the executive director of the Global Privacy and Security by Design Centre. Cerf followed up his news with a re-tweet of Google's "Community Mobility Reports" charting our aggregate movement trends over time, and a tweet of a University of Pittsburgh press release about progress on a COVID-19 vaccine candidate. Earlier in the week Cerf also re-tweeted a humorous compilation of clips from the TV show M*A*S*H that illustrated safe practices while social distancing.

Read more of this story at Slashdot.

Friday, 03 April


Steam Survey Points To Tiny Uptick In Linux Percentage For March [Phoronix]

With Steam and other online gaming platforms seeing record usage in recent weeks as a result of home isolation around the world as a result of the coronavirus, one of the matters of curiosity has been how this will impact the Linux gaming percentage...


Linux 5.7's Char/Misc Brings MHI Bus, Habana Labs AI Accelerator Code Additions [Phoronix]

Greg Kroah-Hartman on Friday sent in his "char/misc" updates for the Linux 5.7 kernel several days later than normal...


Physical Force Alone Spurs Gene Expressions, Study Reveals [Slashdot]

An anonymous reader quotes a report from Phys.Org: Cells will ramp up gene expression in response to physical forces alone, a new study finds. Gene activation, the first step of protein production, starts less than one millisecond after a cell is stretched -- hundreds of times faster than chemical signals can travel, the researchers report. The scientists tested forces that are biologically relevant -- equivalent to those exerted on human cells by breathing, exercising or vocalizing. They report their findings in the journal Science Advances. In the new work, the researchers observed that special DNA-associated proteins called histones played a central role in whether gene expression increased in response to forces that stretched the cell. Histones regulate DNA, winding it up to package it in the nucleus of the cell. One class of histones, known as Histone H3, appear to prevent force-responsive gene expression when methylated at an amino acid known as lysine 9. Methylation involves adding a molecular tag known as a methyl group to a molecule. The scientists observed that H3K9 methylation was highest at the periphery of the nucleus and largely absent from the interior, making the genes in the interior more responsive to stretching. The researchers found they could suppress or boost force-responsive gene expression by increasing or decreasing H3K9 histone methylation. The scientists also tested whether the frequency of an applied force influenced gene expression. They found that cells were most responsive to forces with frequencies up to about 10-20 hertz.

Read more of this story at Slashdot.


FCC: TracFone Made Up 'Fictitious' Customers To Defraud Low-Income Program [Slashdot]

TracFone Wireless is facing a potential $6 million fine for allegedly defrauding a government program that provides discount telecom service to poor people. Ars Technica: The Federal Communications Commission proposed the fine against TracFone yesterday, saying the prepaid wireless provider obtained FCC Lifeline funding by "enroll[ing] fictitious subscriber accounts." TracFone improperly sought and received more than $1 million from Lifeline, the FCC said. The FCC press release said: "TracFone's sales agents -- who were apparently compensated via commissions for new enrollments -- apparently manipulated the eligibility information of existing subscribers to create and enroll fictitious subscriber accounts. For example, TracFone claimed support for seven customers in Florida at different addresses using the same name, all seven of whom had birth dates in July 1978 and shared the same last four Social Security Number digits. The Enforcement Bureau's investigation also found that, in 2018, TracFone apparently sought reimbursement for thousands of ineligible subscribers in Texas. Today's proposed fine is based on the 5,738 apparently improper claims for funding that TracFone made in June 2018 and includes an upward adjustment in light of the company's egregious conduct in Florida." TracFone said it would respond "at the appropriate time" in an effort to reduce or eliminate the proposed fine. The company also said "we take seriously our stewardship of public dollars and will continue to focus on connecting millions of low-income customers to school, jobs, healthcare, and essential social services," according to Reuters.
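The pattern the FCC describes (one identity reused across many addresses) is the kind of thing a simple cross-check catches. Below is a hedged sketch with invented field names, invented records, and an arbitrary threshold; it illustrates the idea of clustering on rarely-colliding fields, not the FCC's actual methodology.

```python
# Illustrative duplicate-enrollment check: group subscriber records on
# fields that should rarely collide (name, birth month, last-4 SSN) and
# flag any cluster claimed at several different addresses.

from collections import defaultdict

def flag_duplicates(records, threshold=3):
    clusters = defaultdict(list)
    for r in records:
        key = (r["name"].lower(), r["birth_month"], r["ssn_last4"])
        clusters[key].append(r["address"])
    # A legitimate subscriber appears once; many addresses per identity
    # is the red flag described in the FCC's Florida example.
    return {k: addrs for k, addrs in clusters.items() if len(addrs) >= threshold}

subscribers = [
    {"name": "J. Doe", "birth_month": "1978-07", "ssn_last4": "1234",
     "address": f"{n} Palm St, FL"} for n in range(7)
] + [
    {"name": "A. Real", "birth_month": "1980-01", "ssn_last4": "9876",
     "address": "1 Oak Ave, TX"},
]

suspect = flag_duplicates(subscribers)
print(len(suspect))  # 1 suspicious identity cluster
```

The one flagged cluster holds seven addresses sharing a single identity, mirroring the seven-Floridians example in the press release.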

Read more of this story at Slashdot.


SpaceX Loses Its Third Starship Prototype During a Cryogenic Test [Slashdot]

For a third time, SpaceX lost one of its Starship prototype spacecraft during a pressure test at the company's test site in Boca Chica, Texas. Ars Technica reports: This week, SpaceX workers in South Texas loaded the third full-scale Starship prototype -- SN3 -- onto a test stand at the company's Boca Chica launch site. On Wednesday night, they pressure-tested the vehicle at ambient temperature with nitrogen, and SN3 performed fine. On Thursday night SpaceX began cryo-testing the vehicle, which means it was loaded again with nitrogen, but this time it was chilled to flight-like temperatures and put under flight-like pressures. Unfortunately, a little after 2am local time, SN3 failed and began to collapse on top of itself. It appeared as if the vehicle may have lost pressurization and become top-heavy. Multiple sources indicated that had these preliminary tests succeeded, SN3 would have attempted a 150-meter flight test as early as next Tuesday. SpaceX founder Elon Musk said on Twitter: "We will see what data review says in the morning, but this may have been a test configuration mistake." A testing issue would be good in the sense that it means the vehicle itself performed well, and the problem can be more easily addressed.

Read more of this story at Slashdot.


Amazon To Delay Marketing Event Prime Day Due To Coronavirus [Slashdot]

An anonymous reader shares a report: Amazon is postponing its major summer shopping event Prime Day at least until August and expects potentially a $100 million hit from excess devices it may now have to sell at a discount, according to internal meeting notes seen by Reuters.

Read more of this story at Slashdot.


Twitter Removes 9,000 Accounts Pushing Coronavirus Propaganda Praising the United Arab Emirates [Slashdot]

An anonymous reader quotes a report from BuzzFeed News: On April 2, Twitter took down a pro-United Arab Emirates network of accounts that was pushing propaganda about the coronavirus pandemic and criticizing Turkey's military intervention in Libya. Previously tied to marketing firms in the region, parts of this network were removed by Facebook and Twitter last year. The network was made up of roughly 9,000 accounts, according to disinformation research firm DFRLab and independent researcher Josh Russell. Although it promoted narratives in line with the political stances of the governments of the UAE, Saudi Arabia, and Egypt, its origins were unclear. Many Twitter handles contained alphanumeric characters instead of names, and many did not post photos. Accounts that did have profile pictures often used images of Indian models. One video pushed by the fake accounts voiced support for the Chinese government during the peak of the coronavirus outbreak in China in February. The video remains online, but lost over 4,000 retweets and likes after the takedown. The video now has four retweets. The bot network also amplified a video of a woman thanking the government of the UAE for transporting Yemeni students out of Wuhan, China. Today, that video, which is also still online, went from having nearly 4,500 retweets to having 70. Spreading propaganda about the coronavirus didn't seem to have been the network's focus. The accounts, some of which posed as journalists and news outlets, amplified an article about the UAE government's disapproval of the Libyan prime minister and boosted criticism of Turkey's support of militias in Libya.

Read more of this story at Slashdot.


Potential Vaccine Generates Enough Antibodies To Fight Off Virus [Slashdot]

Slashdot readers schwit1 and Futurepower(R) are sharing news about a potential coronavirus vaccine that has been found to produce antibodies capable of fighting off Covid-19. The Independent reports: The vaccine, which was tested on mice by researchers at the University of Pittsburgh School of Medicine, generated the antibodies in quantities thought to be enough to "neutralize" the virus within two weeks of injection. The study's authors are now set to apply to the U.S. Food and Drug Administration for investigational new drug approval ahead of phase one human clinical trials planned to start in the next few months. [T]he Pittsburgh research is the first study on a Covid-19 vaccine candidate to be published after review from fellow scientists at outside institutions. The scientists were able to act quickly because they had already laid the groundwork during earlier epidemics of coronaviruses: Sars in 2003 and Mers in 2014. What's also neat about this potential vaccine is that it can sit at room temperature until it is needed and be scaled up to produce the protein on an industrial scale. The fingertip-sized patch of 400 tiny microneedles "inject the spike protein pieces into the skin, where the immune reaction is strongest," the report says. "The patch is stuck on like a plaster and the needles -- which are made entirely of sugar and the protein pieces -- simply dissolve into the skin." While long-term testing is still required, "the mice who were given the Pittsburgh researchers' Mers vaccine candidate developed enough antibodies to neutralize the virus for at least a year," reports The Independent. "The antibody levels of the rodents vaccinated against Covid-19 'seem to be following the same trend,' according to the researchers."

Read more of this story at Slashdot.


Not only is Zoom's strong end-to-end encryption not actually end-to-end, its encryption isn't even that strong [The Register]

Video calls also routed through China, probe discovers

Updated  Zoom has faced increased scrutiny and criticism as its usage soared from 10 million users a day to 200 million in a matter of months, all thanks to coronavirus pandemic lockdowns.…


'Zoombombing' Is a Federal Offense That Could Result In Imprisonment, Prosecutors Warn [Slashdot]

"Zoombombing," where someone successfully invades a public or private meeting over the videoconferencing platform to broadcast shock videos, pornography, or other disruptive content, could result in fines and possible imprisonment, according to federal prosecutors. The Verge reports: The warning was posted as a press release to the Department of Justice's website under the U.S. Attorney's office for Michigan's Eastern District with support from the state attorney general and the FBI. Now, prosecutors say they'll pursue charges for Zoombombing, including "disrupting a public meeting, computer intrusion, using a computer to commit a crime, hate crimes, fraud, or transmitting threatening communications." Some of the charges include fines and possible imprisonment. The press release says that if you or anyone you know becomes a victim of teleconference hacking, they can report it to the FBI's Internet Crime Complaint Center. "Do not make the meetings or classroom public. In Zoom, there are two options to make a meeting private: require a meeting password or use the waiting room feature and control the admittance of guests," the guidance reads. "Do not share a link to a teleconference or classroom on an unrestricted publicly available social media post. Provide the link directly to specific people." The Verge adds: "The guidance also advises against allowing anyone but the host to screenshare and asks that users of Zoom and other apps install the latest updates."

Read more of this story at Slashdot.


Broadband Engineers Threatened Due To 5G Coronavirus Conspiracies [Slashdot]

An anonymous reader quotes a report from The Guardian: Telecoms engineers are facing verbal and physical threats during the lockdown, as baseless conspiracy theories linking coronavirus to the roll-out of 5G technology spread by celebrities such as Amanda Holden prompt members of the public to abuse those maintaining vital mobile phone and broadband networks. Facebook has removed one anti-5G group in which users were being encouraged to supply footage of themselves destroying mobile phone equipment, with some contributors apparently believing this may stop the spread of coronavirus and some running leaderboards of where equipment had been targeted. Video footage of a 70ft (20 meter) telephone mast on fire in Birmingham this week has also circulated widely alongside claims it was targeted by anti-5G protesters. Network operator EE told the Guardian that its engineers were still on site assessing the cause of the fire but it "looks likely at this time" that it was an arson attack. The company said it would be working with the police to find the culprits. The problem has become so bad that engineers working for BT Openreach, which provides home broadband services, have also taken to posting public pleas on anti-5G Facebook groups asking to be spared the on-street abuse as they are not involved in maintaining mobile networks. Industry lobby group Mobile UK said the incidents were affecting efforts to maintain networks that are supporting home working and providing critical connectivity to the emergency services, vulnerable consumers and hospitals. Telecoms engineers are considered key workers under the government's guidelines.

Read more of this story at Slashdot.


EU Rules Rental Car Companies Don't Need To Pay A License To Rent Cars With Radios That Might Play Music [Slashdot]

Mike Masnick, reporting at TechDirt: Five years ago, we wrote about another crazy demand from a PRO (Performance Rights Organization, sometimes known as a "Collection Society"; these groups have a long history of demanding licensing fees for just about every damn thing). The PRO in Sweden demanded that rental car companies pay a performance license because their cars had radios: since "the public" could rent their cars and listen to the radio, that supposedly constituted "a communication to the public" requiring a separate license. The case bounced around the courts, finally reaching the Court of Justice of the EU, which has now ruled that merely renting cars does not constitute "communication to the public."

Read more of this story at Slashdot.


Proton 5.0-6 To Allow Out-Of-The-Box DOOM Eternal On Linux [Phoronix]

Valve is finishing up work on Proton 5.0-6 as the next version of their Wine downstream that powers Steam Play. With Proton 5.0-6 are some promising improvements...


Trump: CDC Recommends Cloth Face Covering To Protect Against Coronavirus [Slashdot]

President Trump says the CDC now recommends using a cloth face covering to protect against coronavirus, but said he does not plan to do so himself. CNBC reports: Trump stressed that the recommendations were merely voluntary, not required. "I don't think I'm going to be doing it," he said as he announced the new guidance. The CDC's website explained that the recommendations were updated following new studies showing that some infected people can transmit the coronavirus even without displaying symptoms of the disease. "In light of this new evidence, CDC recommends wearing cloth face coverings in public settings where other social distancing measures are difficult to maintain," such as in grocery stores or pharmacies, "especially in areas of significant community-based transmission," the CDC says. Developing...

Read more of this story at Slashdot.


NSO Group: Facebook tried to license our spyware to snoop on its own addicts – the same spyware it's suing us over [The Register]

Antisocial network sought surveillance tech to boost its creepy Onavo Protect app, it is claimed

NSO Group – sued by Facebook for developing Pegasus spyware that targeted WhatsApp users – this week claimed Facebook tried to license the very same surveillance software to snoop on its own social-media addicts.…


Apple Brings Its Hardware Microphone Disconnect Feature To iPads [Slashdot]

Apple has brought its hardware microphone disconnect security feature to its latest iPads. From a report: The microphone disconnect security feature aims to make it far more difficult for hackers to use malware or a malicious app to eavesdrop on a device's surroundings. The feature was first introduced to Macs by way of Apple's T2 security chip last year. The security chip ensured that the microphone was physically disconnected from the device when the user shuts their MacBook lid. The idea goes that physically cutting off the microphone from the device prevents malware -- even with the highest level of "root" device permissions -- from listening in to nearby conversations. Apple confirmed in a support guide that its newest iPads have the same feature. Any certified "Made for iPad" case that's attached and closed will trigger the hardware disconnect.

Read more of this story at Slashdot.


NIR Vectorization Lands In Mesa 20.1 For Big Intel Graphics Performance Boost [Phoronix]

The recently covered NIR vectorization pass ported from AMD's ACO back-end for improving the open-source Intel Linux graphics performance has landed now in Mesa 20.1...


Oracle teases prospect of playing nicely with open-source Java in update to WebLogic application server [The Register]

'Low cost of ownership'? This must be an April Fools

Oracle has chosen this week of all weeks to foist on the world an update of its application server WebLogic, festooned with new features addressing Java EE 8, Kubernetes and JSON.…


How Did Covid-19 Begin? WaPo OpEd Calls Its Origin Story 'Shaky' [Slashdot]

The story of how the novel coronavirus emerged in Wuhan, China, has produced a nasty propaganda battle between the United States and China. Columnist David Ignatius writes in an opinion piece for The Washington Post: The two sides have traded some of the sharpest charges made between two nations since the Soviet Union in 1985 falsely accused the CIA of manufacturing AIDS. U.S. intelligence officials don't think the pandemic was caused by deliberate wrongdoing. The outbreak that has now swept the world instead began with a simpler story, albeit one with tragic consequences: The prime suspect is "natural" transmission from bats to humans, perhaps through unsanitary markets. But scientists don't rule out that an accident at a research laboratory in Wuhan might have spread a deadly bat virus that had been collected for scientific study. "Good science, bad safety" is how Sen. Tom Cotton (R-Ark.) put this theory in a Feb. 16 tweet. He ranked such a breach (or natural transmission) as more likely than two extreme possibilities: an accidental leak of an "engineered bioweapon" or a "deliberate release." Cotton's earlier loose talk about bioweapons set off a furor, back when he first raised it in late January and called the outbreak "worse than Chernobyl." Important note: "U.S. intelligence officials think there's no evidence whatsoever that the coronavirus was created in a laboratory as a potential bioweapon. Solid scientific research demonstrates that the virus wasn't engineered by humans and that it originated in bats." In February the Post also quoted Vipin Narang, an associate professor at the Massachusetts Institute of Technology, as saying that it's "highly unlikely" the general population was exposed to a virus through an accident at a lab. "We don't have any evidence for that," said Narang, a political science professor with a background in chemical engineering. 
That article also noted that even Senator Cotton "acknowledged there is no evidence that the disease originated at the lab." "Instead, he suggested it's necessary to ask Chinese authorities about the possibility, fanning the embers of a conspiracy theory that has been repeatedly debunked by experts."

Read more of this story at Slashdot.


Things that go crump in the night: Watch Musk's mighty missile go foom [The Register]

Testing times for SpaceX as another Starship prototype implodes

Video  Yet another of SpaceX's Starship prototypes, SN3, was left in pieces last night following tank testing.…


Thousands of Zoom Video Calls Left Exposed on Open Web [Slashdot]

Thousands of personal Zoom videos have been left viewable on the open Web, highlighting the privacy risks to millions of Americans as they shift many of their personal interactions to video calls in an age of social distancing. From a report: Many of the videos appear to have been recorded through Zoom's software and saved onto separate online storage space without a password. But because Zoom names every video recording in an identical way, a simple online search can reveal a long stream of videos that anyone can download and watch. Zoom videos are not recorded by default, though call hosts can choose to save them to Zoom servers or their own computers. There's no indication that live-streamed videos or videos saved onto Zoom's servers are publicly visible. But many participants in Zoom calls may be surprised to find their faces, voices and personal information exposed because a call host can record a large group call without participants' consent.
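The core problem is the naming scheme rather than the storage itself: a uniform, predictable file name makes recordings discoverable with a single search, while an unguessable random token would not be. The file-name pattern below is invented for illustration and is not Zoom's actual one.

```python
# Sketch of guessable vs. unguessable recording names. A predictable,
# uniform pattern can be enumerated or found via search; a random token
# carries ~128 bits of entropy and cannot.

import secrets

def predictable_name(seq: int) -> str:
    # Hypothetical uniform pattern: trivially enumerable by anyone.
    return f"meeting_recording_{seq}.mp4"

def private_name() -> str:
    # Unguessable: 16 random bytes, URL-safe encoded.
    return f"{secrets.token_urlsafe(16)}.mp4"

# An outsider can reconstruct every predictable name without any access:
guessed = [predictable_name(i) for i in range(3)]
print(guessed[1])  # meeting_recording_1.mp4
```

The same asymmetry is why unlisted links on many services use long random tokens: the URL itself is the secret, which only works if it cannot be generated by a third party.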

Read more of this story at Slashdot.


Slashdot Asks: Do You Own a Gaming Console? What Titles Have You Been Playing Lately? [Slashdot]

What games have you been playing on your Xbox, PlayStation, or Nintendo Switch -- or any other gaming machine?

Read more of this story at Slashdot.


Microsoft brings Mixed Reality toys and other improvements to 'citizen developers' using low-code Power Apps platform [The Register]

Mobile compatibility issue fixed and working with data in grids made easier

Microsoft has updated its "citizen developer" platform, Power Apps, adding Mixed Reality support, fixing a compatibility issue with the mobile app, and improving options for working with data in grids.…


Ada++ Wants To Make The Ada Programming Language More Accessible [Phoronix]

Ada is a beautiful programming language when it comes to code safety, and it continues to be used by aircraft and other safety-critical systems. There is now Ada++, an unofficial fork of the language focused on making it more accessible and friendlier in an era when the likes of Rust and Golang are attracting much interest...


WeWork Founder Misses Out on $1 Billion as SoftBank Cancels Share Buyout [Slashdot]

SoftBank is walking away from a sizeable chunk of its WeWork rescue package, which included a near billion dollar windfall for ousted founder Adam Neumann. From a report: The Japanese tech company has backed out of a plan to buy $3 billion worth of shares in the coworking startup from existing shareholders and investors, according to statements from SoftBank and a special committee of WeWork's board. SoftBank's chief legal officer, Rob Townsend, said in a statement on Thursday that the share purchase was subject to certain conditions agreed to in October. "Several of those conditions were not met, leaving SoftBank no choice but to terminate the tender offer," he said.

Read more of this story at Slashdot.


PCI Changes For Linux 5.7 Bring Error Disconnect Recover, P2P DMA For Skylake-E [Phoronix]

The PCI subsystem changes were sent out today for the ongoing Linux 5.7 kernel merge window...


Where's the best place to add Mentos to Diet Coke for the most foam? How big are the individual bubbles? Has science gone too far? [The Register]

Teachers trek high and low to uncover cola geyser secrets

Did you know that the popular Diet Coke and Mentos soda geyser experiment works better at higher altitudes? Or that the average size of the bubbles formed on the scotch mints is about 6μm? Now you do, thanks to the wonders of science and those with a bubbling passion for it.…


How Google Ruined the Internet [Slashdot]

An anonymous reader shares a column: Remember that story about the Polish dentist who pulled out all of her ex-boyfriend's teeth in an act of revenge? It was complete and utter bullshit. 100% fabricated. No one knows who wrote it. Nevertheless, it was picked up by Fox News, the Los Angeles Times and many other publishers. That was eight years ago, yet when I search now for "dentist pulled ex boyfriends teeth," I get a featured snippet that quotes ABC News' original, uncorrected story. Who invented the fidget spinner? Ask Google Assistant and it will tell you that Catherine Hettinger did: a conclusion based on poorly-reported stories from The Guardian, The New York Times and other major news outlets. Bloomberg's Joshua Brustein clearly demonstrated that Ms. Hettinger did not invent the low friction toy. Nevertheless, ask Google Assistant "who really invented the fidget spinner?" and you'll get the same answer: Catherine Hettinger. In 1998, the velocity of information was slow and the cost of publishing it was high (even on the web). Google leveraged those realities to make the best information retrieval system in the world. Today, information is free, plentiful and fast moving; somewhat by design, Google has become a card catalog that is constantly being reordered by an angry, misinformed mob. The web was supposed to forcefully challenge our opinions and push back, like a personal trainer who doesn't care how tired you say you are. Instead, Google has become like the pampering robots in WALL-E, giving us what we want at the expense of what we need. But, it's not our bodies that are turning into mush: It's our minds.

Read more of this story at Slashdot.


Amazon Exec Called Fired Worker 'Not Smart' in Leaked Memo [Slashdot]

A senior Amazon executive called a fired Staten Island warehouse worker "not smart or articulate" in internal discussions about how the company should respond to employee criticism of its handling of the pandemic, Bloomberg reported Friday. From a report: Amazon General Counsel David Zapolsky said fired worker Chris Smalls should be the focus of Amazon's public-relations campaign countering activist employees, according to a person who saw an internal memo. Amazon workers around the country have been walking off the job or holding demonstrations to highlight what they describe as inadequate safety precautions. Smalls said the memo reveals that Amazon is more interested in managing its public image than protecting workers, and he called on employees to keep pressuring the company to implement stronger safeguards. "Amazon wants to make this about me, but whether Jeff Bezos likes it or not, this is about Amazon workers -- and their families -- everywhere," he said, referring to the company's chief executive officer. "There are thousands of scared workers waiting for a real plan from Amazon so that its facilities do not become epicenters of the crisis. More and more positive cases are turning up every day."

Read more of this story at Slashdot.


ESA missions back doing science after precautionary pandemic plug pull: We talk to space boffins about Mars Express emergency command line [The Register]

Meanwhile, three-quarters of NASA staff now staying at home

ESA's mission operations centre in Germany has got back to doing interplanetary science after a short stand-down due to COVID-19.…


Zoom's Encryption Is 'Not Suited for Secrets' and Has Surprising Links To China, Researchers Discover [Slashdot]

Meetings on Zoom, the increasingly popular video conferencing service, are encrypted using an algorithm with serious, well-known weaknesses, and sometimes using keys issued by servers in China, even when meeting participants are all in North America, according to researchers at the University of Toronto. From a report: The researchers also found that Zoom protects video and audio content using a home-grown encryption scheme, that there is a vulnerability in Zoom's "waiting room" feature, and that Zoom appears to have at least 700 employees in China spread across three subsidiaries. They conclude, in a report for the university's Citizen Lab -- widely followed in information security circles -- that Zoom's service is "not suited for secrets" and that it may be legally obligated to disclose encryption keys to Chinese authorities and "responsive to pressure" from them.

Read more of this story at Slashdot.


A Hacker Has Wiped, Defaced More Than 15,000 Elasticsearch Servers [Slashdot]

For the past two weeks, a hacker has been breaking into Elasticsearch servers that have been left open on the internet without a password and attempting to wipe their content, while also leaving the name of a cyber-security firm behind, trying to divert blame. From a report: According to security researcher John Wethington, one of the people who saw this campaign unfolding and who aided ZDNet in this report, the first intrusions began around March 24. The attacks appear to be carried out with the help of an automated script that scans the internet for Elasticsearch systems left unprotected, connects to the databases, attempts to wipe their content, and then creates a new empty index named after the security firm. The attacking script doesn't appear to work in all instances, though, as that index is also present in databases where the content has been left intact.

Read more of this story at Slashdot.


Facebook Wanted NSO Spyware To Monitor Users, NSO CEO Claims [Slashdot]

Facebook representatives approached controversial surveillance vendor NSO Group to try and buy a tool that could help Facebook better monitor a subset of its users, according to an extraordinary court filing from NSO in an ongoing lawsuit. From a report: Facebook is currently suing NSO for how the hacking firm leveraged a vulnerability in WhatsApp to help governments hack users. NSO sells a product called Pegasus, which allows operators to remotely infect cell phones and lift data from them. According to a declaration from NSO CEO Shalev Hulio, two Facebook representatives approached NSO in October 2017 and asked to purchase the right to use certain capabilities of Pegasus. At the time, Facebook was in the early stages of deploying a VPN product called Onavo Protect, which, unbeknownst to some users, analyzed the web traffic of users who downloaded it to see what other apps they were using. According to the court documents, it seems the Facebook representatives were not interested in buying parts of Pegasus as a hacking tool to remotely break into phones, but more as a way to more effectively monitor phones of users who had already installed Onavo.

Read more of this story at Slashdot.


Saturday Morning Breakfast Cereal - The Point [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

Later he finds his true calling being forgotten about at the end of lines of code.

Today's News:


Motorola casually trots out third UK release in as many months: This time it's a 'Lite' take on the Moto G8 Power [The Register]

No MWC, no problem

Hello again, Moto. In the past month, the Lenovo-owned mobile maker has announced three new smartphones for the UK market. The latest is the Moto G8 Power Lite, which retails at £149.99, and offers a surprising amount of battery life for your buck.…


Google Is Publishing Location Data From 131 Countries To Show How Coronavirus Lockdowns Are Working [Slashdot]

Google is using the location data it collects from billions of smartphones to show how people's movements have changed as governments around the world lock down cities and issue shelter in place orders to slow down the spread of the coronavirus. From a report: Reports generated using this data, which is normally used to show how busy a certain location is on Google Maps, and which Google says does not identify individual people, are freely available on a brand new website called COVID-19 Community Mobility Reports. "We have heard from public health officials that this same type of aggregated, anonymized data could be helpful as they make critical decisions to combat COVID-19," wrote Google senior vice president Jen Fitzpatrick and Karen DeSalvo, chief health officer for Google Health, in a blog post published Friday. The data is currently available for 131 countries, and in many locations including the US, you can also access data for individual counties.

Read more of this story at Slashdot.


Biz software pusher IFS goes a bit Minority Report with augmented-reality repair suite [The Register]

But it's still playing catch-up with big boys SAP and Oracle

ERP flinger IFS is inflicting more augmented reality on the unsuspecting world of repair and maintenance as it strives to catch up with Oracle and SAP.…


Disney+ Launches in India For $20 a Year, Includes Shows From HBO, Showtime, and Live TV Channels [Slashdot]

Disney+ has arrived in India through Hotstar, a popular on-demand video streamer the giant conglomerate picked up as part of the Fox deal. From a report: To court users in India, the largest open entertainment market in Asia, Disney is charging users 1,499 Indian rupees (about $19.50) for a year, the most affordable plan in any of the more than a dozen markets where Disney+ is currently available. Subscribers of the revamped streaming service, now called Disney+ Hotstar, will get access to Disney Originals in English as well as several local languages, live sporting events, dozens of TV channels, and thousands of movies and shows, including some sourced from HBO, Showtime, ABC and Fox that maintain syndication partnerships with the Indian streaming service. It also maintains a partnership with Hooq -- at least for now. Unlike Disney+'s offering in the U.S. and other markets, in India the service does not support 4K and streams content at nearly a tenth of the bitrate used in other markets.

Read more of this story at Slashdot.


LLVM Lands Performance-Hitting Mitigation For Intel LVI Vulnerability [Phoronix]

Made public in March was the Load Value Injection (LVI) attack affecting Intel CPUs with SGX capabilities. LVI combines Spectre-style code gadgets with Meltdown-type illegal data flows to bypass existing defenses and allow injecting data into a victim's transient execution. While mitigations on the GNU side quickly landed, the LLVM compiler mitigations were just merged today.


Ubuntu 20.04 LTS Beta Released [Phoronix]

For those with extra time on their hands due to being at home and social distancing, Canonical released the Ubuntu 20.04 LTS beta today for testing...


Cabinet Office dangles £15m for help ditching its Single Operating Platform for cloud-based ERP system [The Register]

Project 'SOP2SaaS'... it just rolls off the tongue

The Cabinet Office is offering a £15m contract for a consultancy to help it shift central government enterprise applications to a software-as-a-service delivery model, part of an ambitious refresh programme.…


Scientists Develop AI That Can Turn Brain Activity Into Text [Slashdot]

An anonymous reader quotes a report from The Guardian: Writing in the journal Nature Neuroscience, [researchers from the University of California, San Francisco] reveal how they developed their system by recruiting four participants who had electrode arrays implanted in their brain to monitor epileptic seizures. These participants were asked to read aloud from 50 set sentences multiple times, including "Tina Turner is a pop singer," and "Those thieves stole 30 jewels." The team tracked their neural activity while they were speaking. This data was then fed into a machine-learning algorithm, a type of artificial intelligence system that converted the brain activity data for each spoken sentence into a string of numbers. To make sure the numbers related only to aspects of speech, the system compared sounds predicted from small chunks of the brain activity data with actual recorded audio. The string of numbers was then fed into a second part of the system which converted it into a sequence of words. At first the system spat out nonsense sentences. But as the system compared each sequence of words with the sentences that were actually read aloud it improved, learning how the string of numbers related to words, and which words tend to follow each other. The team then tested the system, generating written text just from brain activity during speech. The system was not perfect, but for one participant just 3% of each sentence on average needed correcting -- "better than the word error rate of 5% for professional human transcribers," the report says. "But, the team stress, unlike the latter, the algorithm only handles a small number of sentences." "The team also found that training the algorithm on one participant's data meant less training data was needed from the final user -- something that could make training less onerous for patients."

Read more of this story at Slashdot.


UK judge gives Google a choice: Either let SEO expert read your ranking algos or withdraw High Court evidence [The Register]

Tough choice for adtech monolith in Foundem case

Google must either show its "crown jewels" to a man it described to the High Court as a search engine optimisation expert – or give up parts of its defence in a long-running competition lawsuit, the High Court has ruled.…


Linus Torvalds Questions The Not So Glorious Driver For That Funky Looking RGB Mouse [Phoronix]

Last month I noted a new Linux driver for a buggy and funky looking mouse. A special driver was created by a community developer because not all of the mouse's buttons worked otherwise, as the device does not abide by HID specifications. Now that the driver has been merged for Linux 5.7, Linus Torvalds had words to share on this open-source driver...


Trailblazing a Development Environment for Workers [The Cloudflare Blog]


When I arrived at Cloudflare for an internship in the summer of 2018, I was taken on a tour, introduced to my mentor who took me out for coffee (shoutout to Preston), and given a quick whiteboard overview of how Cloudflare works. Each of the interns would work on a small project of their own and they’d try to finish them by the end of the summer. The description of the project I was given on my very first day read something along the lines of “implementing signed exchanges in a Cloudflare Worker to fix the AMP URL attribution problem,” which was a lot to take in at once. I asked so many questions those first couple of weeks. What are signed exchanges? Can I put these stickers on my laptop? What’s a Cloudflare Worker? Is there a limit to how much Topo Chico I can take from the fridge? What’s the AMP URL attribution problem? Where’s the bathroom?

I got the answers to all of those questions (and more!) and eventually landed a full-time job at Cloudflare. Here’s the story of my internship and working on the Workers Developer Experience team at Cloudflare.

Getting Started with Workers in 2018

After doing a lot of reading, and asking a lot more questions, it was time to start coding. I set up a Cloudflare account with a Workers subscription, and was greeted with a page that looked something like this:


I was able to change the code in the text area on the left, click “Update”, and the changes would be reflected on the right — fairly self-explanatory. There was also a testing tab which allowed me to handcraft HTTP requests with different methods and custom headers. So far so good.

As my project evolved, it became clear that I needed to leave the Workers editor behind. Anything more than a one-off script tends to require JavaScript modules and multiple files. I spent some time setting up a local development environment for myself with npm and webpack (see: purgatory, a place or state of temporary suffering).

After I finally got everything working, my iteration cycle looked a bit like this:

  1. Make a change to my code
  2. Run npm run build (which ran webpack and bundled my code in a single script)
  3. Open ./dist/worker.min.js (the output from my build step)
  4. Copy the entire contents of the built Worker to my clipboard
  5. Switch to the Cloudflare Workers Dashboard
  6. Paste my script into the Workers editor
  7. Click update
  8. Investigate the behavior of my recently modified script
  9. Rinse and repeat

There were two main things here that were decidedly not a fantastic developer experience:

  1. Inspecting the value of a variable by adding a console.log statement would take me ~2-3 minutes and involve lots of manual steps to perform a full rebuild.
  2. I was unable to use familiar HTTP clients such as cURL and Postman without deploying to production. This was because the Workers Preview UI was an iframe nested in the dashboard.

Luckily for me, Cloudflare Workers deploy globally incredibly quickly, so I could push the latest iteration of my Worker, wait just a few seconds for it to go live, and cURL away.

A Better Workers Developer Experience in 2019

Shortly after we shipped AMP Real URL, Cloudflare released Wrangler, the official CLI tool for developing Workers, and I was hired full time to work on it. Wrangler came with a feature that automated steps 2-7 of my workflow by running the command wrangler preview, which was a significant improvement. Running the command would build my Worker and open the browser automatically for me so I could see log messages and test out HTTP requests. That summer, our intern Matt Alonso created wrangler preview --watch. This command automatically updates the Workers preview window when changes are made to your code. You can read more about that here. This was, yet again, another improvement over my old friend Build and Open and Copy and Switch Windows and Paste Forever and Ever, Amen. But there was still no way that I could test my Worker with any HTTP client I wanted without deploying to production — I was still locked in to using the nested iframe.

A few months ago we decided it was time to do something about it. To the whiteboard!

Enter wrangler dev

Most web developers are familiar with developing their applications on localhost, and since Wrangler is written in Rust, it means we could start up a server on localhost that would handle requests to a Worker. The idea was to somehow start a server on localhost and then transform incoming requests and send them off to a preview session running on a Cloudflare server.

Proof of Concept

What we came up with ended up looking a little something like this — when a developer runs wrangler dev, do the following:

  1. Build the Worker
  2. Upload the Worker via the Cloudflare API as a previewable Worker
  3. The Cloudflare API takes the uploaded script and creates a preview session, and returns an access token
  4. Start listening for incoming HTTP requests at localhost:8787

Top secret fact: 8787 spells out Rust on a phone numpad. Happy Easter!

  5. All incoming requests to localhost:8787 are modified:
  • All headers are prepended with cf-ew-raw- (for instance, X-Auth-Header would become cf-ew-raw-X-Auth-Header)
  • The URL is changed to${path}
  • The Host header is changed to
  • The cf-ew-preview header is added with the access token returned from the API in step 3
  6. After sending this request, the response is modified:
  • All headers not prefixed with cf-ew-raw- are discarded and headers with the prefix have it removed (for instance, cf-ew-raw-X-Auth-Success would become X-Auth-Success)
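The rewriting in those steps is just string manipulation on headers and URLs. As a rough JavaScript sketch of the idea (the function names and the preview host value here are hypothetical; only the cf-ew-* header conventions come from the steps above):

```javascript
// Hypothetical sketch of the header rewriting wrangler dev performs.
function toPreviewRequest(headers, path, previewHost, accessToken) {
  const out = {};
  for (const [name, value] of Object.entries(headers)) {
    out[`cf-ew-raw-${name}`] = value; // every original header gets the prefix
  }
  out['Host'] = previewHost;          // point the request at the preview host
  out['cf-ew-preview'] = accessToken; // token returned by the preview API
  return { url: `https://${previewHost}${path}`, headers: out };
}

function fromPreviewResponse(headers) {
  const out = {};
  for (const [name, value] of Object.entries(headers)) {
    // keep only cf-ew-raw- headers, stripping the prefix; drop the rest
    if (name.startsWith('cf-ew-raw-')) {
      out[name.slice('cf-ew-raw-'.length)] = value;
    }
  }
  return out;
}
```

The prefixing is what lets the preview session distinguish the headers your HTTP client actually sent from the ones added in transit.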

The hard part here was already done — the Workers Core team had already implemented the API to support the Preview UI. We just needed to gently nudge Wrangler and the API to be the best of friends. After some investigation into Rust’s HTTP ecosystem, we settled on using the HTTP library hyper, which I highly recommend if you’re in need of a low level HTTP library — it’s fast, correct, and the ergonomics are constantly improving. After a bit of work, we got a prototype working and carved Wrangler ❤️ Cloudflare API into the old oak tree down by Lady Bird Lake.


Let’s say I have a Workers script that looks like this:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  let message = "Hello, World!"
  return new Response(message)
}

If I created a Wrangler project with this code and ran wrangler dev, this is what it looked like:

$ wrangler dev
👂  Listening on

In another terminal session, I could run the following:

$ curl localhost:8787
Hello, World!

It worked! Hooray!

Just the Right Amount of Scope Creep

At this point, our initial goal was complete: any HTTP client could test out a Worker before it was deployed. However, wrangler dev was still missing crucial functionality. When running wrangler preview, it’s possible to view console.log output in the browser editor. This is incredibly useful for debugging Workers applications, and something with a name like wrangler dev should include a way to view those logs as well. “This will be easy,” I said, not yet knowing what I was signing up for. Buckle up!

console.log, V8, and the Chrome Devtools Protocol, Oh My!

My first goal was to get a Hello, World! message streamed to my terminal session so that developers can debug their applications using wrangler dev. Let’s take the script from earlier and add a console.log statement to it:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  let message = "Hello, World!"
  console.log(message) // this line is new
  return new Response(message)
}

If you’d like to follow along, you can paste that script into the editor at using Google Chrome.

This is what the Preview editor looks like when that script is run:


You can see that Hello, World! has been printed to the console. This may not be the most useful example, but in more complex applications logging different variables is helpful for debugging. If you’re following along, try changing console.log(message) to something more interesting, like console.log(request.url).

The console may look familiar to you if you’re a web developer because it’s the same interface you see when you open the Developer Tools in Google Chrome. Since Cloudflare Workers is built on top of V8 (more info about that here and here), the Workers runtime is able to create a WebSocket that speaks the Chrome Devtools Protocol. This protocol allows the client (your browser, Wrangler, or anything else that supports WebSockets) to send and receive messages that contain information about the script that is running.

In order to see the messages that are being sent back and forth between our browser and the Workers runtime:

  1. Open Chrome Devtools
  2. Click the Network tab at the top of the inspector
  3. Click the filter icon underneath the Network tab (it looks like a funnel and is nested between the cancel icon and the search icon)
  4. Click WS to filter out all requests but WebSocket connections

Your inspector should look like this:


Then, reload the page, and select the /inspect item to view its messages. It should look like this:


Hey look at that! We can see messages that our browser sent to the Workers runtime to enable different portions of the developer tools for this Worker, and we can see that the runtime sent back our Hello, World! Pretty cool!

On the Wrangler side of things, all we had to do to get started was initialize a WebSocket connection for the current Worker, and send a message with the method Runtime.enable so the Workers runtime would enable the Runtime domain and start sending console.log messages from our script.
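Those messages are plain JSON, so the handshake can be sketched in a few lines of JavaScript (the helper names here are made up; the message shapes follow the Chrome Devtools Protocol):

```javascript
// Hypothetical helpers for speaking the Chrome Devtools Protocol.
// CDP requests are JSON objects with an id, a method, and optional params.
let nextId = 0;
function cdpRequest(method, params = {}) {
  // Each request carries a unique id so replies can be matched to requests
  return JSON.stringify({ id: ++nextId, method, params });
}

// Once the Runtime domain is enabled, console.log output arrives as
// Runtime.consoleAPICalled events; pull the logged values back out of one
function extractConsoleLog(rawMessage) {
  const msg = JSON.parse(rawMessage);
  if (msg.method !== 'Runtime.consoleAPICalled') return null;
  return msg.params.args.map(arg => arg.value).join(' ');
}
```

Sending `cdpRequest('Runtime.enable')` over the WebSocket and feeding incoming frames through `extractConsoleLog` is, in miniature, how log lines end up in the terminal.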

After those initial steps, it quickly became clear that a lot more work was needed to get to a useful developer tool. There's a lot that goes into the Chrome Devtools Inspector and most of the libraries for interacting with it are written in languages other than Rust (which we use for Wrangler). We spent a lot of time switching WebSocket libraries due to incompatibilities across operating systems (turns out TLS is hard) and implementing the parts of the Chrome Devtools Protocol that we needed in Rust. There's a lot of work that still needs to be done in order to make wrangler dev a top notch developer tool, but we wanted to get it into the hands of developers as quickly as possible.

Try it Out!

wrangler dev is currently in alpha, and we’d love it if you could try it out! You should first check out the Quick Start and then move on to wrangler dev. If you run into issues or have any feedback, please let us know!

Signing Off

I’ve come a long way from where I started in 2018 and so has the Workers ecosystem. It’s been awesome helping to improve the developer experience of Workers for future interns, internal Cloudflare teams, and of course our customers. I can’t wait to see what we do next. I have some ideas for what’s next with Wrangler, so stay posted!

P.S. Wrangler is also open source, and we are more than happy to field bug reports, feedback, and community PRs. Check out our Contribution Guide if you want to help out!


Need a new IT role? These organizations are hiring engineers, leaders, analysts – see inside for more details [The Register]

Our free job ad offer continues and vacancies keep rolling in

Job Alert  Welcome to this week's jobs list, a rundown of vacancies El Reg is advertising for free to keep tech people in work amid the coronavirus pandemic.…


Nikon Is Streaming Online Photography Courses For Free This Month [Slashdot]

Nikon USA is offering 10 classes from its online school for free during the month of April. Engadget reports: The courses range in length from 15 minutes to well over an hour, and all are taught by pro photographers and often Nikon ambassadors. Each class runs between $15 and $50, so Nikon is offering $250 worth of photography training for free. The courses run a wide gamut from landscape photography, macro photography, fundamentals by Reed Hoffman and even "The Art of Making Music Videos" with Chris Hershman. Several others are camera-specific, like a Z50 video course from Kitty Peters and a hands-on course with Nikon's SB-5000 speedlight. You do have to give Nikon your name and address, but the value of the courses is easily worth that -- to check them out, go here.

Read more of this story at Slashdot.


POCL 1.5 Released With Performance Improvements, Fixes For OpenCL On CPUs [Phoronix]

POCL 1.5 has been released as the "Portable CL" implementation for running OpenCL on CPUs and other devices with LLVM back-ends...


Tech services biz Allvotec furloughing staff, asking remainder – including top brass – to take pay cut [The Register]

CEO talks of measures to combat expected sales slide due to pandemic

Allvotec – the rebranded Daisy Partner Services business – is responding to the coronavirus crisis by furloughing a number of staff and asking all that remain to take a pay cut to avoid potential redundancies.…


Intel MKL-DNN / DNNL 1.3 Released With Cooper Lake Optimizations [Phoronix]

Intel on Thursday released version 1.3 of their Deep Neural Network Library (DNNL), formerly known as MKL-DNN, offering an open-source performance library for deep learning applications...


Windows spotted flashing its unmentionables in a Chicago clothier [The Register]

This season's colours are blue, white and bork

Bork!Bork!Bork!  Chicago! A town famed for what some might regard as a jumped-up quiche masquerading as pizza and home of the first skyscraper. Could there be a better venue for today's bork?…


Take back your dotfiles with Chezmoi [Fedora Magazine]

In Linux, dotfiles are hidden text files that store configuration settings for many programs, from Bash and Git to more complex applications like i3 or VSCode.

Most of these files are contained in the ~/.config directory or right in the home directory. Editing these files allows you to customize applications beyond what a settings menu may provide, and they tend to be portable across devices and even other Linux distributions. But one talking point across the Linux enthusiast community is how to manage these dotfiles and how to share them.

We will be showcasing a tool called Chezmoi that does this task a little differently from the others.

The history of dotfile management

If you search GitHub for dotfiles, you will see over 100k repositories with one goal: store people's dotfiles in a shareable and repeatable manner. However, other than using Git, they all store their files differently.

While Git solves code management problems in ways that also translate to config file management, it does not address how to separate configuration between distributions or roles (such as home vs. work computers), manage secrets, or handle per-device configuration.

Because of this, many users decide to craft their own solutions, and the community has responded with multiple answers over the years. This article will briefly cover some of the solutions that have been created.

Experiment in an isolated environment

Do you want to quickly try the solutions below in a contained environment? Run:

$ podman run --rm -it fedora

… to create a Fedora container to try the applications in. This container will automatically delete itself when you exit the shell.

The install problem

If you store your dotfiles in a Git repository, you will want your changes to be applied automatically inside your home directory. At first glance, the easiest way to do this is with symlinks, such as ln -s ~/.dotfiles/bashrc ~/.bashrc. This allows your changes to take effect instantly when your repository is updated.

The problem with symlinks is that managing them can be a chore. Stow and RCM (covered here on Fedora Magazine) can help, but neither is a seamless solution. Files that are private will need to be modified and chmodded properly after download. If you revamp your dotfiles on one system and then download your repository to another, you may get conflicts that require troubleshooting.

Another solution to this problem is writing your own install script. This is the most flexible option, but has the tradeoff of requiring more time to build a custom solution.
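That DIY approach can be as small as a shell loop. Below is a minimal sketch, assuming a hypothetical ~/.dotfiles checkout and an arbitrary list of files; real install scripts grow quickly from here, which is exactly the maintenance burden dedicated tools try to remove:

```shell
#!/bin/sh
# Minimal DIY installer: symlink each tracked file from the repo into $TARGET.
# DOTFILES and TARGET are hypothetical defaults; override them for your layout.
DOTFILES="${DOTFILES:-$HOME/.dotfiles}"
TARGET="${TARGET:-$HOME}"

for f in bashrc gitconfig vimrc; do
    # -s: create a symlink; -f: replace an existing one so re-runs are idempotent
    ln -sf "$DOTFILES/$f" "$TARGET/.$f"
done
```

Handling private files, templating, and per-host differences would all have to be bolted on by hand.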

The secrets problem

Git is designed to track changes. If you store a secret such as a password or an API key in your Git repository, you will have a difficult time removing it: you will need to rewrite your Git history. If your repository is public and someone else has already cloned it, the secret is compromised for good. This problem alone prevents many individuals from sharing their dotfiles publicly.
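To make the problem concrete, here is a throwaway demonstration (the repository name, file name, and token are all made up): deleting the secret and committing the deletion does not remove it, because the old blob remains reachable through history.

```shell
# Work in a temporary directory; nothing here touches real dotfiles.
cd "$(mktemp -d)"
git init -q secret-demo && cd secret-demo
git config user.email "demo@example.com"
git config user.name "Demo"

echo "API_KEY=hunter2" > secrets.env        # the "accidental" secret
git add secrets.env && git commit -qm "add config"

git rm -q secrets.env && git commit -qm "remove secret"

# The deletion is just another commit; the previous version is still there:
git show HEAD~1:secrets.env                 # prints API_KEY=hunter2
```

Actually scrubbing it means rewriting history (for example with git filter-repo), and anyone who already cloned the repository keeps their copy regardless.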

The multi-device config problem

The problem is not pulling your config to multiple devices; the problem is that multiple devices often require different configuration. Most individuals handle this by keeping different folders or different forks, which makes it difficult to share configs across devices and role sets.

How Chezmoi works

Chezmoi is a tool to manage your dotfiles with the above problems in mind. It doesn’t blindly copy or symlink files from your repository. Instead, Chezmoi acts more like a template engine, generating your dotfiles based on system variables, templates, secret managers, and Chezmoi’s own config file.

Getting Started with Chezmoi

Currently Chezmoi is not in the default repositories. You can install the version of Chezmoi that is current as of this writing with the following command.

$ sudo dnf install

This will install the pre-packaged RPM to your system.

Let’s go ahead and create your repository using:

$ chezmoi init

It will create your new repository in ~/.local/share/chezmoi/. You can easily cd to this directory by using:

$ chezmoi cd

Let’s add our first file:

$ chezmoi add ~/.bashrc

… to add your bashrc file to your chezmoi repository.

Note: if your bashrc file is actually a symlink, you will need to add the -f flag to follow it and read the contents of the real file.

You can now edit this file using:

$ chezmoi edit ~/.bashrc

Now let’s add a private file. This is a file with permissions 600 or similar. I have a file at ~/.ssh/config that I would like to add using:

$ chezmoi add ~/.ssh/config

Chezmoi uses special prefixes in its source directory to keep track of which files are hidden and which are private, working around Git’s limitations. Run the following commands to see them:

$ chezmoi cd
$ ls

Do note that files marked as private are not actually private: they are still saved as plain text in your Git repo. More on that later.
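For illustration, after the two adds above the source directory might contain entries along these lines (a sketch of Chezmoi's naming convention, not literal output from your machine):

```
$ chezmoi cd
$ ls
dot_bashrc  private_dot_ssh
```

The dot_ prefix stands in for the leading dot, and private_ records that the file or directory should be applied with restrictive permissions.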

You can apply any changes by using:

$ chezmoi apply

… and inspect what is different by using:

$ chezmoi diff

Using variables and templates

To export all of your data Chezmoi can gather, run:

$ chezmoi data

Most of these are details about your username, architecture, hostname, OS type, and OS name. But you can also add your own variables.

Go ahead and run:

$ chezmoi edit-config

… and input the following:

[data]
  email = ""
  name = "Fedora Mcdora"

Save your file and run chezmoi data again. You will see at the bottom that your email and name have been added. You can now use these in templates with Chezmoi. Run:

$ chezmoi add -T --autotemplate ~/.gitconfig

… to add your gitconfig as a template to Chezmoi. If Chezmoi successfully infers the template, you could get the following:

         email = "{{ .email }}"
         name = "{{ .name }}"

If it does not, you can change the file to this instead.

Inspect your file with:

$ chezmoi edit ~/.gitconfig

Then use:

$ chezmoi cat ~/.gitconfig

… to see what chezmoi will generate for this file. My generated example is below:

[root@a6e273a8d010 ~]# chezmoi cat ~/.gitconfig
     email = ""
     name = "Fedora Mcdora"
[root@a6e273a8d010 ~]#

It will generate a file filled in with the variables from our chezmoi config.
You can also use the variables to perform simple logic statements. One example is:

{{- if eq .chezmoi.hostname "fsteel" }}
# this will only be included if the host name is equal to "fsteel"
{{- end }}
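The same test works against any of the values reported by chezmoi data. As a sketch (hypothetical contents, for illustration only), a template can branch on the operating system and reuse the custom variables defined earlier:

```
{{- if eq .chezmoi.os "linux" }}
# Only included on Linux hosts
{{- else if eq .chezmoi.os "darwin" }}
# Only included on macOS hosts
{{- end }}
email = "{{ .email }}"
```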

Do note that for this to work, the file has to be a template. You can check by looking for a “.tmpl” suffix on the file’s name in the source directory (chezmoi cd), or by re-adding the file using the -T option.

Keeping secrets… secret

To troubleshoot your setup, use the following command.

$ chezmoi doctor 

What is important here is that it also shows you which password managers it supports.

[root@a6e273a8d010 ~]# chezmoi doctor
 warning: version dev
      ok: runtime.GOOS linux, runtime.GOARCH amd64
      ok: /root/.local/share/chezmoi (source directory, perm 700)
      ok: /root (destination directory, perm 550)
      ok: /root/.config/chezmoi/chezmoi.toml (configuration file)
      ok: /bin/bash (shell)
      ok: /usr/bin/vi (editor)
 warning: vimdiff (merge command, not found)
      ok: /usr/bin/git (source VCS command, version 2.25.1)
      ok: /usr/bin/gpg (GnuPG, version 2.2.18)
 warning: op (1Password CLI, not found)
 warning: bw (Bitwarden CLI, not found)
 warning: gopass (gopass CLI, not found)
 warning: keepassxc-cli (KeePassXC CLI, not found)
 warning: lpass (LastPass CLI, not found)
 warning: pass (pass CLI, not found)
 warning: vault (Vault CLI, not found)
 [root@a6e273a8d010 ~]# 

You can use any of these clients, a generic client, or your system’s keyring.

For GPG, you will need to add the following to your config using:

$ chezmoi edit-config

[gpg]
  recipient = "<Your GPG key's recipient>"

You can use:

$ chezmoi add --encrypt

… to add files; these will be encrypted in your source repository and not exposed to the public as plain text. Chezmoi will automatically decrypt them when applying.

We can also use secrets in templates. For example, take a secret token stored in Pass (covered on Fedora Magazine). Go ahead and generate your secret.

In this example, it’s called “githubtoken”:

rwaltr@fsteel:~] $ pass ls
 Password Store
 └── githubtoken
 [rwaltr@fsteel:~] $ 

Next, edit your template, such as the .gitconfig we created earlier, and add this line:

token = {{ pass "githubtoken" }}

Then let’s inspect using:

$ chezmoi cat ~/.gitconfig
[rwaltr@fsteel:~] $ chezmoi cat ~/.gitconfig
# This is Git's per-user configuration file.
          name = Ryan Walter
          email =
          token = mysecrettoken
[rwaltr@fsteel:~] $

Now your secrets are properly secured in your password manager, and your config can be publicly shared without risk!

Final notes

This is only scratching the surface. Please check out Chezmoi’s website for more information. The author also keeps his dotfiles public if you are looking for more examples of how to use Chezmoi.


Zoom vows to spend next 90 days thinking hard about its security and privacy after rough week, meeting ID war-dialing tool emerges [The Register]

Passwords-by-default feature may be faulty. But hey, who else just went from 10 to 200 million daily users?

Video-conferencing app maker Zoom has promised to do better at security after a bruising week in which it was found to be unpleasantly leaky in several ways.…


Absolutely everyone loves video conferencing these days. Some perhaps a bit too much [The Register]

Saving Sales from a self-inflicted dirty deed

On Call  Phew, March is over. Everything will be OK now, right? Right? Oh well... join us in nervously welcoming April with another tale from that special breed tasked with answering the phone, even when the subject matter is perhaps less than savoury.…


Modern Meteorology Was Born 60 Years Ago Today [Slashdot]

"Sixty years ago on this date, April 1, a Thor-Able rocket launched a small satellite weighing 122.5kg into an orbit about 650km above the Earth's surface," writes Ars Technica's Eric Berger. "Effectively, this launch from Florida's Cape Canaveral Air Force Station marked the beginning of the era of modern weather forecasting." From the report: Designed by the Radio Corporation of America and put into space by NASA, the Television InfraRed Observation Satellite, or TIROS-1, was the nation's first weather satellite. During its 78 days of operation, TIROS-1 successfully monitored Earth's cloud cover and weather patterns from space. This was a potent moment for the field of meteorology. For the first time, scientists were able to combine space-based observations with physical models of the atmosphere that were just beginning to be run on supercomputers. After World War II, mathematician John von Neumann led development of a computer to crunch through a set of equations put together by Jule Charney and other scientists. By the mid-1950s, Charney's group began to produce numerical forecasts on a regular basis. All of a sudden, meteorologists had two incredibly useful tools at their hands. Of course, it would take time for more powerful computers to produce higher-resolution forecasts, and the sensor technology launched on satellites would require decades to improve to the point where spacecraft could collect data for temperature, moisture, and other environmental variables at various levels in the atmosphere. But by around 1980, the tools of satellite observations and numerical models that could process that data started to mature. Scientists had global satellite coverage, 24 hours a day, and forecasts began to improve dramatically. Today, the fifth day of a five-day forecast on the app on your phone is about as accurate as the next day's forecast was in 1980.

Read more of this story at Slashdot.


Cricket's average-busting mathematician Tony Lewis pulls up stumps [The Register]

University lecturer and half of Duckworth-Lewis passes, aged 78

Eminent British mathematician Tony Lewis has died, aged 78.…


Plenty Of New Sound Hardware Support, Continued Sound Open Firmware Work For Linux 5.7 [Phoronix]

SUSE's Takashi Iwai who oversees the sound subsystem for the Linux kernel sent in his changes on Thursday that are ready for the 5.7 kernel...


Automatic for the People: Pandemic-fueled rush to robo-moderation will be disastrous – there must be oversight [The Register]

EFF raises alarm over increasing reliance on shoddy automation

Analysis  The Electronic Frontier Foundation on Thursday warned that the consequences of the novel coronavirus pandemic – staff cuts, budget cuts, and lack of access to on-site content review systems, among others – have led tech companies to focus even more resources on barely functional moderation systems.…

Thursday, 02 April


Intel's 10th-gen Core family cracks 5GHz barrier with H-series laptop processors [The Register]

New line-up includes first i9 part in this latest generation

Intel has announced its tenth-generation Core i5, i7, and i9 H-series microprocessors for laptops, which max out at 5.3GHz.…


Chrome, Skype, Microsoft Teams, Zoom, VSCode Now Unofficially Available For Clear Linux [Phoronix]

One of the common criticisms for those trying to use Clear Linux on the desktop is that it lacks easy access to proprietary packages like Google Chrome and Steam. There has been plumbing within its swupd package/bundle management system to support third-party repositories to expand the ecosystem and now we're finally seeing that happen...


Salesforce publishes self-themed activity book to keep your kids ‘Appy [The Register]

A new use for App Exchange, with that slightly greasy Ronald McDonald vibe

Salesforce has decided to offer some help to parents who are trying to balance working from home with keeping kids entertained.…


'Call of Duty' Wins First Amendment Victory Over Use of Humvees [Slashdot]

An anonymous reader quotes a report from The Hollywood Reporter: Call of Duty maker Activision has prevailed in a closely watched trademark dispute brought by AM General, the government contractor for Humvees. On Tuesday, a New York federal judge responded favorably to Activision's argument that it had a First Amendment right to depict contemporary warfare in its game by featuring Humvees. "If realism is an artistic goal, then the presence in modern warfare games of vehicles employed by actual militaries undoubtedly furthers that goal," writes U.S. District Court Judge George B. Daniels in granting summary judgment in favor of Activision. The video game publisher fought AM General's claims along with Major League Gaming Corp., a professional esports organization. The dispute was potentially worth tens of millions of dollars, and the discussion attracted intellectual property professors and the Electronic Software Association to weigh in with amicus briefs. You can read the full opinion here.

Read more of this story at Slashdot.


Philippines considers app to trace coronavirus carriers [The Register]

Privacy perspective: President has also threatened quarantine-breaking troublemakers may be shot

The Philippines has started planning an app to help the government track the movements and contacts of people who carry the novel coronavirus.…


Australian digital-radio-for-railways Huawei project derailed by US trade sanctions against Chinese tech giant [The Register]

Uncle Sam's crackdown sparks 'force majeure event that cannot be overcome'

One of Huawei’s flagship projects in Australia has been called off because, as the state government put it, “trade restrictions imposed by the US government create a force majeure event that cannot be overcome.”…


Twitter Discloses Firefox Bug That Cached Private Files Sent or Received via DMs [Slashdot]

Social networking giant Twitter today disclosed a bug on its platform that impacted users who accessed the service using the Firefox browser. From a report: According to Twitter, its platform stored private files inside the Firefox browser's cache -- a folder where websites store information and files temporarily. Twitter said that once users left the platform or logged off, the files would remain in the browser cache, allowing anyone to retrieve them. The company is now warning users who share workstations or have used a public computer that some of their private files may still be present in the Firefox cache. Malware present on a system could also scrape and steal this data, if configured to do so.

Read more of this story at Slashdot.


NASA's classic worm logo returns for first all-American trip to ISS in years: Are you a meatball or a squiggly fan? [The Register]

Should boffins keep it old-school space-race era – or embrace the, er, future of the 1970s?

Poll  NASA has brought back its sleek iconic logo, lovingly named the worm for its curvy red font, in time for its first crewed spaceflight using American rockets in almost a decade.…


If you use Twitter with Firefox in a shared computer account, you may have slightly spilled some private data on that PC [The Register]

HTTP header ends in own goal

Twitter on Thursday warned of an esoteric bug that, in limited circumstances, allowed users' non-public profile information to potentially fall into the hands of other users.…


NetBSD 8.2 Released With Fix For Ryzen USB Issues, Fix For Booting Single Core CPUs [Phoronix]

While NetBSD 9.0 has been out since mid-February, for those still on the NetBSD 8 series the NetBSD 8.2 milestone is now available with various fixes. As a result of the coronavirus, the NetBSD 7 series is also being extended...


Linux 5.7 Seeing Updates For Intel SpeedSelect Technology, Jasper Lake PMC [Phoronix]

Andy Shevchenko submitted on Tuesday the x86 platform driver updates targeting the Linux 5.7 kernel merge window...


Why is ransomware still a thing? One-in-three polled netizens say they would cave to extortion demands [The Register]

American young adults are easiest marks for criminals, study reckons

Want to know why ransomware is still rampant? One in three surveyed folks in North America said they would be willing to pay up to unscramble their files once their personal systems were infected.…


Google changes course, proposes proprietary in-app purchase API as web standard [The Register]

Developers rejoice, there may be (eventually) a store-agnostic way to sell in-application items

Google has decided to try to openly standardize one of its own APIs for handling in-app purchases in web apps rather than pursuing a previously proposed proprietary plan.…


US prez Trump's administration reportedly nears new rules banning 'dual-use' tech sales to China [The Register]

Non-military tech rule exception plus other tweaks mulled

The US government is reportedly close to introducing stringent new rules that would stop Chinese companies from buying certain high-technology components, including semiconductors and optical materials.…


System76 Thelio Major Proves To Be A Major Player For Linux Workstations [Phoronix]

For the past two months we have been testing the System76 Thelio Major and it's been working out extremely well in terms of performance and reliability. The Thelio Major offers options for Intel Core X-Series or AMD Ryzen Threadripper CPUs and sits between the standard Thelio desktop with Ryzen/Core CPUs and the Thelio Massive that sports dual Intel Xeon CPUs.


X-Plane 11.50 Flight Simulator Beta Released With Vulkan API Support [Phoronix]

For years we have been looking forward to X-Plane with a new Vulkan renderer to replace its aging OpenGL renderer. Finally today the X-Plane 11.50 Beta has been made public for this realistic flight simulator that supports Metal on Apple platforms and Vulkan everywhere else...


Tech tracker Tile testifies in Congress: Apple's geolocation nagging is so not fair [The Register]

Alleges anticompetitive behaviour in the walled garden. There's no party like a third party, eh?

Channeling their inner Kevin Patterson, Tile this week bemoaned Apple's unfairness to a US congressional panel in Colorado investigating the iPhone maker's stewardship of its app ecosystem.…


Maintain business continuity in these challenging times with the Akamai Edge Live Virtual Summit 2020 [The Register]

Get the latest advice and insight on scale, resiliency and intelligence

Promo  Business continuity is another of those aspects of IT that, before, was always pitched on a rather "what if" basis. What if your global cloud comms provider goes down? What if a squirrel chews through one of your server room’s insulation cables?…


Saturday Morning Breakfast Cereal - Pu [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

Blessings upon every tweeter who contributed to this solution. I decided not to include cladding because cladding is for losers.

Today's News:

If I get at least 4 people offering one million dollars a piece I'll run a kickstarter for this with FREE bookmarks.


GNOME 3.36.1 Released With First Batch Of Fixes [Phoronix]

Following last month's release of GNOME 3.36 with its many new features and performance improvements, GNOME 3.36.1 is out today with the first batch of updates/fixes to this H1'2020 open-source desktop...


Huawei P40 pricing is in step with previous P-series efforts – but flagship lacks the apps punters have come to expect [The Register]

£899 and no social media? That's going to be a big ask

There are few surprises around UK pricing and availability for Huawei's latest P40 handsets, which are more or less consistent with previous P-series models – but they are missing a few things punters may not be too, er, appy about.…


Mesa OpenGL Threading Enabled For More Games Yielding Sizable Performance Jumps [Phoronix]

Well known open-source AMD OpenGL driver developer Marek Olšák has enabled more Linux games to run with Mesa's GLTHREAD functionality enabled for helping with the performance...


Boeing 787s must be turned off and on every 51 days to prevent 'misleading data' being shown to pilots [The Register]

US air safety bods call it 'potentially catastrophic' if reboot directive not implemented

The US Federal Aviation Administration has ordered Boeing 787 operators to switch their aircraft off and on every 51 days to prevent what it called "several potentially catastrophic failure scenarios" – including the crashing of onboard network switches.…


Rethinking VPN: Tailscale startup packages Wireguard with network security [The Register]

'A whole bunch of tunnels': Mesh networking with per-node permissions and OAuth security

Interview  WireGuard, a new VPN protocol with both strong performance and easy setup, has been adopted by startup Tailscale as the basis of a peer-to-peer remote networking system that is both secure and quick to configure.…


Cloudflare Doubling Size of 2020 Summer Intern Class [The Cloudflare Blog]


We are living through extraordinary times. Around the world, the Coronavirus has caused disruptions to nearly everyone's work and personal lives. It's been especially hard to watch as friends and colleagues outside Cloudflare are losing jobs and businesses struggle through this crisis.

We have been extremely fortunate at Cloudflare. The super heroes of this crisis are clearly the medical professionals at the front lines saving people's lives and the scientists searching for a cure. But the faithful sidekick that's helping us get through this crisis — still connected to our friends, loved ones, and, for those of us fortunate enough to be able to continue work from home, our jobs — is the Internet. As we all need it more than ever, we're proud of our role in helping ensure that the Internet continues to work securely and reliably for all our customers.

We plan to invest through this crisis. We are continuing to hire across all teams at Cloudflare and do not foresee any need for layoffs. I appreciate the flexibility of our team and new hires to adapt what was our well-oiled, in-person orientation process to something virtual we're continuing to refine weekly as new people join us.

Summer Internships

One group that has been significantly impacted by this crisis is students who were expecting internships over the summer. Many are, unfortunately, getting notice that the experiences they were counting on have been cancelled. These internships are not only a significant part of these students' education, but in many cases provide an income that helps them get through the school year.

Cloudflare is not cancelling any of our summer internships. We anticipate that many of our internships will need to be remote to comply with public health recommendations around travel and social distancing. We also understand that some students may prefer a remote internship even if we do begin to return to the office so they can take care of their families and avoid travel during this time. We stand by every internship offer we have extended and are committed to making each internship a terrific experience whether remote, in person, or some mix of both.

Doubling the Size of the 2020 Internship Class

But, seeing how many great students were losing their internships at other companies, we wanted to do more. Today we are announcing that we will double the size of Cloudflare’s summer 2020 internship class. Most of the internships we offer are in our product, security, research and engineering organizations, but we also have some positions in our marketing and legal teams. We are reopening the internship application process and are committed to making decisions quickly so students can plan their summers. You can find newly open internships posted here.

Internships are jobs, and we believe people should be paid for the jobs they do, so every internship at Cloudflare is paid. That doesn't change with these new internship positions we're creating: they will all be paid.

Highlighting Other Companies with Opportunities

Even when we double the size of our internship class we expect that we will receive far more qualified applicants than we will be able to accommodate. We hope that other companies that are in a fortunate position to be able to weather this crisis will consider expanding their internship classes as well. We plan to work with peer organizations and will highlight those that also have summer internship openings. If your company still has available internship positions, please let us know by emailing so we can point students your way:

Opportunity During Crisis

Cloudflare was born out of a time of crisis. Michelle and I were in school when the global financial crisis hit in 2008. Michelle had spent that summer at an internship at Google. That was the one year Google decided to extend no full-time offers to summer interns. So, in the spring of 2009, we were both still trying to figure out what we were going to do after school.

It didn't feel great at the time, but had we not been in the midst of that crisis I'm not sure we ever would have started Cloudflare. Michelle and I remember the stress of that time very clearly. The recognition of the importance of planning for rainy days has been part of what has made Cloudflare so resilient. And it's why, when we realized we could play a small part in ensuring some students who had lost the internships they thought they had could still have a rewarding experience, we knew it was the right decision.

Together, we can get through this. And, when we do, we will all be stronger.

Apply here.

Or visit and type Internship into the search box to see a complete list of all internship options.

LXD 4.0 LTS Released For Offering The Latest Linux Containers Experience [Phoronix]

Ahead of the Ubuntu 20.04 LTS release later this month, the Canonical folks working on LXD for Linux containers and VMs have released LXD 4.0 LTS...


Btrfs File-System Updates Land In Linux 5.7 [Phoronix]

SUSE's David Sterba sent in the Btrfs file-system updates this week for the Linux 5.7 kernel...


ZX Spectrum prototype ROM is now available for download courtesy of boffins at the UK's Centre for Computing History [The Register]

Lose yourself in a relic of simpler times

Got some unexpected time on your hands and a yearning for simpler times? May we present an original prototype ROM of the Sinclair ZX Spectrum, courtesy of The Centre for Computing History, for your tinkering pleasure.…


VMware plans to give vSphere power to automatically patch everything running in a VM [The Register]

But first, you get a taste of it with host lifecycle management, and more waiting for K8s integration

VMware plans to give its flagship vSphere product the power to patch all the software inside a virtual machine.…


Soup to nuts? Not quite for SAP asset management as oil drilling firm employs supply-chain add-on to save $6.8m [The Register]

That's a pretty expensive gap in S4/HANA

When Precision Drilling, a global company that serves the oil and gas industry, upgraded its asset management application to SAP S/4HANA two years ago, it was expecting the software to handle data going into its supply chain systems. But it soon discovered that was not the case.…


LLVM Plumbs Support For Intel Golden Cove's New SERIALIZE Instruction [Phoronix]

Yesterday we noted Intel's programming reference manual being updated with new Golden Cove instructions for Sapphire Rapids and Alder Lake and with that Intel's open-source developers have begun pushing their changes to the compilers. The latest updates add TSXLDTRK, a new HYBRID bit for Core+Atom hybrid CPUs, and a new SERIALIZE instruction. After GCC was receiving the patch attention yesterday, LLVM is getting its attention today...


Linux 5.6.2 Released With Fix For The IWLWIFI Intel WiFi Driver [Phoronix]

Basically a half-week after Linux 5.6 shipped as stable, we are up to the second point release of it...


Do you want to be an astronaut when you grow up? Yeah, you and 12,000 others: NASA flooded with folks hoping to visit Moon, Mars [The Register]

Cramped conditions? Not being able to see friends and family? Sounds familiar

If the current coronavirus pandemic has got you wanting to leave Planet Earth, you’re not alone. More than 12,000 people answered NASA’s latest call for astronauts to explore the Moon and Mars.…


Intel 10th Gen H-Series Mobile CPUs Hit Up To 5.3GHz [Phoronix]

Days after AMD announced their full Ryzen 4000 series mobile CPU line-up, Intel has now introduced their 10th Gen Core H-series processors...


Huawei signs non-aggression patent pact with membership of Open Invention Network [The Register]

Chinese giant plays nice with open source

Updated  Huawei has become a licensee member of the Open Invention Network (OIN), whose members agree to cross-license Linux patents to one another royalty-free, and to any organisation that agrees not to assert its patents against Linux.…


Fitbit unfurls last new wearable before it's gobbled by Google, right on time for global pandemic lockdown [The Register]

Track your jogs around the block – while those are still allowed

The gyms are shut. The government wants you to stay indoors. You can only leave your home once per day. And nobody, it seems, told Fitbit, which this week announced its latest calorie-counting, timber-trimming wearable – the Fitbit Charge 4.…


GCC 10 Release Candidate Likely Hitting In The Next Few Weeks [Phoronix]

The month of April usually sees the new annual GNU Compiler Collection (GCC) feature releases and for GCC 10 in the form of GCC 10.1 as the first stable release in the series does stand chances of releasing this month...


Here's what Europeans are buying amid the COVID-19 lockdown – aside from heaps of pasta and toilet paper [The Register]

Clue: Home working might have played a bit part, even in return of the desktop

The rush to work from home as COVID-19 grips Europe has led to bumper sales of related tech for distributors, official stats confirm.…


Google Cloud Engine outage caused by 'large backlog of queued mutations' [The Register]

Ad giant added memory to servers, restarted, watched things get worse ... is on top of things again now

A 14-hour Google cloud platform outage that we missed in the shadow of last week's G Suite outage was caused by a failure to scale, an internal investigation has shown.…


Slack hooks up with Microsoft Teams and Zoom VoIP calls [The Register]

Maybe the world needs the Pidgin of voice and video right now?

Slack has added integrations with Microsoft Teams and Zoom calls.…


If you thought black holes only came in S or XXXL, guess again, maybe: Elusive mid-mass void spotted eating star [The Register]

Could this candidate be a missing link between small and bonkers-massive?

Astronomers have discovered what they believe could be a black hole of intermediate-mass nestled on the outskirts of a large galaxy more than 700 million light years away.…


Cisco rations VPNs for staff as strain of 100,000+ home workers hits its network [The Register]

Following the moon to find capacity and sticking to safe sites as Chuck Robbins leads new weekly briefings

Cisco was surprised by how quickly it needed to adopt a global working-from-home policy, amid the coronavirus pandemic, and is now rationing VPN use to safeguard its security.…


Does the US CLOUD Act hang darkly over your data privacy? [The Register]

How to combat the threat and comply with GDPR

Webcast  Here’s something that you may not know, something the cloud companies are not keen to shout about too loudly.…

Wednesday, 01 April


SELinux Seeing Performance Improvements With Linux 5.7 [Phoronix]

A few months back, when we last looked at the performance impact of having SELinux enabled, there was a hit but not too bad for most workloads. We'll need to take another look soon, though, as the Linux 5.7 kernel brings some performance improvements and more for SELinux...


Well, 2019 finished with Intel as king of the chip world, Broadcom doing OK, everyone else shrinking. Good thing 2020's looking up, eh? [The Register]

Oh, oh no... oh God

Intel and Broadcom were the lone beacons of success in an otherwise dismal semiconductor market last year, according to industry analysts at Omdia (formerly IHS Markit).…


Mesa 20.0.3 Released With Latest Open-Source Graphics Driver Fixes [Phoronix]

While many of you run Mesa Git to experience the bleeding-edge graphics drivers, especially gamers wanting peak performance, for those on the Mesa stable series the Mesa 20.0.3 update has now shipped...


Vietnam bans posting fake news online [The Register]

About coronavirus or anything else

Vietnam will fine people posting fake news on social media in an effort to crack down on the spread of both general misinformation and falsehoods about the novel coronavirus.…


The Mistake that Caused to Block LGBTQIA+ Sites Today [The Cloudflare Blog]


Today we made a mistake. The mistake caused a number of LGBTQIA+ sites to be inadvertently blocked by the new for Families service. I wanted to walk through what happened, why, and what we've done to fix it.

As has been our tradition for the last three years, on April 1 we roll out new products for the general public that uses the Internet. This year, one of those products was for Families, a filtered DNS service. The service allows anyone who chooses to use it to restrict certain categories of sites.

Filtered vs Unfiltered DNS

Nothing about our new filtered DNS service changes the unfiltered nature of our original service. However, we recognized that some people want a way to control what content is in their home. For instance, I block social media sites from resolving while I am trying to get work done because it makes me more productive. The number one request from users of was that we create a version of the service for home use to block certain categories of sites. And so, earlier today, we launched for Families.

Over time, we'll provide the ability for users of for Families to customize exactly which categories they block (e.g., do what I do with social media sites to stay productive). But, initially, we created two default settings covering the most-requested types of content people wanted to block: Malware (which you can block by setting and as your DNS resolvers) and Malware + Adult Content (which you can block by setting and as your DNS resolvers).

Licensed Categorization Data

To get data for for Families, we licensed feeds from multiple providers who specialize in site categorization. We spent the last several months reviewing classification providers to choose the ones with the highest accuracy and lowest false-positive rates.

Malware, encompassing a range of widely agreed upon cyber security threats, was the easier of the two categories to define. For Adult Content, we aimed to mirror the Google SafeSearch criteria. Google has been thoughtful in this area and their SafeSearch tool is designed to limit search results for "sexually explicit content." The definition is focused on pornography and largely follows the requirements of the US Children's Internet Protection Act (CIPA), which schools and libraries in the United States are required to follow.

Because it was the default for the service, and because we planned in the future to allow individuals to set their own specifications beyond the default, we intended the Adult Content category to be narrow. What we did not intend to include in the Adult Content category was LGBTQIA+ content. And yet, when it launched, we were horrified to receive reports that those sites were being filtered.

Choosing the Wrong Feed

So what went wrong? The data providers that we license content from have different categorizations; those categorizations do not line up perfectly between different providers. One of the providers has multiple "Adult Content" categories. One “Adult Content” category includes content that mirrors the Google SafeSearch/CIPA definition. Another “Adult Content” content category includes a broader set of topics, including LGBTQIA+ sites.

While we had specifically reviewed the Adult Content category to ensure that it was narrowly tailored to mirror the Google SafeSearch/CIPA definition, when we released the production version this morning we included the wrong “Adult Content” category from the provider in the build. As a result, the first users who tried saw a broader set of sites being filtered than was intended, including LGBTQIA+ content. We immediately worked to fix the issue.

Slow to Update Data Structures

In order to distribute the list of sites quickly to all our data centers we use a compact data structure. The upside is that we can replicate the data structure worldwide very efficiently. The downside is that generating a new version of the data structure takes several hours. The minute we saw that we'd made a mistake we pulled the incorrect data provider and began recreating the new data structure.

While the new data structure replicated across our network we pushed individual sites to an allow list immediately. We began compiling lists both from user reports as well as from other LGBTQIA+ resources. These updates went out instantly. We continuously added sites to the allow list as they were reported or we discovered them.
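
The two-tier design described here — a compiled blocklist that takes hours to rebuild, fronted by an allow list that updates instantly and always wins — can be sketched as follows. This is an illustrative model, not Cloudflare's actual code; the class and domain names are invented for the example.

```python
# Sketch of the two-tier lookup described above: a compiled blocklist that is
# slow to regenerate, fronted by an allow list that can be updated instantly.

class FilteringResolver:
    def __init__(self, compiled_blocklist):
        # Stands in for the compact, globally replicated data structure that
        # takes hours to rebuild. A frozenset here; production would use
        # something far more memory-efficient.
        self.blocklist = frozenset(compiled_blocklist)
        # Instantly updatable override list, consulted before the blocklist.
        self.allowlist = set()

    def allow(self, domain):
        """Push a site to the allow list immediately, without a rebuild."""
        self.allowlist.add(domain)

    def is_blocked(self, domain):
        if domain in self.allowlist:  # overrides always win
            return False
        return domain in self.blocklist


resolver = FilteringResolver({"mistakenly-blocked.example", "malware.example"})
assert resolver.is_blocked("mistakenly-blocked.example")
resolver.allow("mistakenly-blocked.example")  # the instant fix
assert not resolver.is_blocked("mistakenly-blocked.example")
```

The key property is that the allow-list check happens first, so a mistaken entry in the slow-to-rebuild structure can be neutralized immediately while the corrected structure replicates.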

By 16:51 UTC, approximately two hours after we'd received the first report of the mistaken blocking, the data structure with the intended definition of Adult Content had been generated and we pushed it out live. The only users that would have seen over-broad blocking are those that had already switched to the service. Users of (which will remain unfiltered) and would not have experienced this inadvertent blocking.

As of now, the filtering provided by the default setting of is what we intended it to be and should roughly match what you find if you use Google SafeSearch; LGBTQIA+ sites are not being blocked. If you see sites being blocked that should not be, please report them to us here.

Protections for the Future

Going forward, we've set up a number of checks of known sites that should fall outside the intended categories, including many that we mistakenly listed today. Before defaults are updated in the future, our build system will confirm that none of these sites are listed. We hope this will help catch mistakes like this in the future.

I'm sorry for the error. While I understand how it happened, it should never have happened. I appreciate our team responding quickly to fix the mistake we made.


VMware seeks new Australian MD as Alister Dias departs for Asian role [The Register]

Scores new gig promoting software-defined data centres

VMware is hunting for a new vice-president and managing director for Australia and New Zealand.…


Japanese airline ANA spins out telepresence-bot startup for virus-avoiding medicos and fearful tourists [The Register]

Imagine an iPad running FaceTime clamped to a post stuck into a Roomba and you'll get the idea

Japanese airline ANA has spun out a startup to develop and sell “avatars”: robots comprising a remote-controllable stand with an iPad-like device running a FaceTime-like app, to bring your face into a room.…


Amazon says it fired a guy for breaking pandemic rules. Same guy who organized a staff protest over a lack of coronavirus protection [The Register]

Wow, how convenient

On Monday, Amazon fired Chris Smalls, a worker at its Staten Island, New York, warehouse, who had organized a protest demanding more protection for workers amid the coronavirus outbreak.…


GNU Guix Wants To Replace The Linux-Libre Kernel With The Hurd Micro-Kernel [Phoronix]

At first it seemed like just an April Fools' Day joke, but it turns out the GNU Guix developers responsible for their package manager and operating system are actually working to replace their Linux kernel (GNU Linux-libre, to be exact) with GNU Hurd...


Microsoft finds itself in odd position of sparing elderly, insecure protocols: Grants stay of execution to TLS 1.0, 1.1 [The Register]

A few more months to get those servers upgraded 'in light of current global circumstances'

Microsoft has blinked once again and delayed disabling TLS 1.0 and 1.1 by default in its browsers until the latter part of 2020.…


For the past five years, every FBI secret spy court request to snoop on Americans has sucked, says watchdog [The Register]

Feeling secure? Sucker

Analysis  The FBI has not followed internal rules when applying to spy on US citizens for at least five years, according to an extraordinary report [PDF] by the Department of Justice’s inspector general.…


Access Analysis, GuardDuty and Inspector gadgets not enough? Here comes another AI-driven security tool for AWS [The Register]

What have you got for us, Detective?

Amazon's Detective has hit general availability, adding to a range of AWS security services, which at this point has become a little confusing.…


GTK 3.98.2 Released As Another Step Towards GTK4 [Phoronix]

GTK 3.98.2 is out as the latest development snapshot on the road to the overdue but much-anticipated GTK 4.0...


Microsoft's PowerToys suite sprouts four new playthings with a final March emission [The Register]

Today's Window Walk is brought to you by the letters 'R' 'E' and 'G'

Microsoft has persevered with the quickfire release cadence of the toolbox of stuff that should really be built into Windows 10 in the form of PowerToys 0.16.…


Cloudflare family-friendly DNS service flubs first filtering foray: Vital LGBTQ, sex-ed sites blocked 'by mistake' [The Register]

For a biz that prides itself on not censoring the internet, it sure likes censoring the internet

Updated  Cloudflare, known for free speech advocacy, rolled out a self-styled family-friendly variation of its DNS service to block adult content – and ended up denying access to LGBTQ websites and sex education resources.…


Upstreaming LLVM's Fortran "Flang" Front-End Has Been Flung Back Further [Phoronix]

Upstreaming of LLVM's Fortran front-end, developed as "f18" and being upstreamed under the Flang name, was supposed to happen back in January. Three months later, the developers are still struggling to get the code into shape for integration...


Linux 5.7 Gets A Unified/User-Space-Access-Intended Accelerator Framework [Phoronix]

The Linux 5.7 crypto subsystem updates include new drivers...


Saturday Morning Breakfast Cereal - Rolling [Saturday Morning Breakfast Cereal]


Fortunately, by the time grave robbery turns out to be illegal, they have enough venture capital that laws don't apply.



GCC 11 Will Likely Support Using LLVM's libc++ [Phoronix]

While GCC 10 won't be out for a few more weeks, looking ahead to next year's GCC 11 release there is already one interesting planned change...


Linux 5.7 Networking Changes Bring Qualcomm IPA, New Intel Driver Additions [Phoronix]

The networking changes for the Linux 5.7 kernel have already been merged and as usual there is a lot of new wired and wireless networking driver activity...


Introducing for Families [The Cloudflare Blog]


Two years ago today we announced, a secure, fast, privacy-first DNS resolver free for anyone to use. In those two years, has grown beyond our wildest imagination. Today, we process more than 200 billion DNS requests per day, making us the second-largest public DNS resolver in the world behind only Google.


Yesterday, we announced the results of the privacy examination. Cloudflare's business has never involved selling user data or targeted advertising, so it was easy for us to commit to strong privacy protections for We've also led the way in supporting encrypted DNS technologies, including DNS over TLS and DNS over HTTPS. It is long past time to stop transmitting DNS in plaintext, and we're excited that we see more and more encrypted DNS traffic every day. for Families


Since launching, the number one request we have received has been to provide a version of the product that automatically filters out bad sites. While can safeguard user privacy and optimize efficiency, it is designed for direct, fast DNS resolution, not for blocking or filtering content. The requests we've received largely come from home users who want to ensure that they have a measure of protection from security threats and can keep adult content from being accessed by their kids. Today, we're happy to answer those requests.


Introducing for Families: the easiest way to add a layer of protection to your home network and shield it from malware and adult content. for Families leverages Cloudflare's global network to ensure that it is fast and secure around the world. It includes the same strong privacy guarantees that we committed to when we launched two years ago. And, just like, we're providing it for free, for any home anywhere in the world.


Two Flavors: (No Malware) & (No Malware or Adult Content)

 for Families is easy to set up and install, requiring just changing two numbers in the settings of your home devices or network router: your primary DNS and your secondary DNS. Setting up for Families usually takes less than a minute, and we've provided instructions for common devices and routers in the installation guide. for Families has two default options: one that blocks malware and one that blocks malware and adult content. You choose which setting you want depending on which IP addresses you configure.

Malware Blocking Only
Primary DNS:
Secondary DNS:

Malware and Adult Content
Primary DNS:
Secondary DNS:

For IPv6 use:

Malware Blocking Only
Primary DNS: 2606:4700:4700::1112
Secondary DNS: 2606:4700:4700::1002

Malware and Adult Content
Primary DNS: 2606:4700:4700::1113
Secondary DNS: 2606:4700:4700::1003
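
For scripting a device or router configuration, the address table above can be captured in a small lookup helper. This is an illustrative convenience function, not an official Cloudflare tool; the "unfiltered" entry uses the standard and addresses of the original service.

```python
# Map the desired filtering level to the Cloudflare resolver addresses:
# (primary IPv4, secondary IPv4, primary IPv6, secondary IPv6).

RESOLVERS = {
    "unfiltered": ("", "",
                   "2606:4700:4700::1111", "2606:4700:4700::1001"),
    "malware": ("", "",
                "2606:4700:4700::1112", "2606:4700:4700::1002"),
    "malware+adult": ("", "",
                      "2606:4700:4700::1113", "2606:4700:4700::1003"),
}

def resolver_ips(level):
    """Return the DNS server addresses for a filtering level, or raise."""
    try:
        return RESOLVERS[level]
    except KeyError:
        raise ValueError(f"unknown filtering level: {level!r}") from None

assert resolver_ips("malware")[0] == ""
```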

Additional Configuration


In the coming months, we will provide the ability to define additional configuration settings for for Families. These will include options to create specific whitelists and blacklists of certain sites. You will be able to set the times of day when categories such as social media are blocked, and to get reports on your household's Internet usage. for Families is built on top of the same site-categorization and filtering technology that powers Cloudflare's Gateway product. With the success of Gateway, we wanted to provide an easy-to-use service that can help any home network be fast, reliable, secure, and protected from potentially harmful content.

Not A Joke

Most of Cloudflare's business involves selling services to businesses. However, we've made it a tradition every April 1 to launch a new consumer product that leverages our network to bring more speed, reliability, and security to every Internet user. While we make money selling to businesses, the products we launch at this time of the year are close to our hearts because of the broad impact they have for every Internet user.


This year, while many of us are confined to our homes, protecting our communities from COVID-19, and relying on our home networks more than ever, it seemed especially important to launch for Families. We hope that during these troubled times it will help provide a bit of peace of mind for households everywhere.

Announcing the Beta for WARP for macOS and Windows [The Cloudflare Blog]


Last April 1 we announced WARP, an option within the iOS and Android app to secure and speed up Internet connections. Today, millions of users have secured their mobile Internet connections with WARP.

While WARP started as an option within the app, it's really a technology that can benefit any device connected to the Internet. In fact, one of the most common requests we've received over the last year is support for WARP on macOS and Windows. Today we're announcing exactly that: the start of the WARP beta for macOS and Windows.

What's The Same: Fast, Secure, and Free

We always wanted to build a WARP client for macOS and Windows. We started with mobile because it was the hardest challenge, and it turned out to be a lot harder than we anticipated. While we announced the beta of with WARP on April 1, 2019, it took us until late September to open it up to general availability. We don't expect the wait for WARP on macOS and Windows to be nearly as long.

The WARP client for macOS and Windows relies on the same fast, efficient WireGuard protocol to secure Internet connections and keep them safe from being spied on by your ISP. And, just like WARP in the mobile app, the basic service will be free on macOS and Windows.


WARP+ Gets You There Faster

We plan to add WARP+ support in the coming months, allowing you to leverage Cloudflare's Argo network for even faster Internet performance. We will provide a plan option for existing WARP+ subscribers to add additional devices at a discount. In the meantime, existing WARP+ users will be among the first invited to try WARP for macOS and Windows. If you are a WARP+ subscriber, check your app over the coming weeks for a link to an invitation to try the new WARP for macOS and Windows clients.

If you're not a WARP+ subscriber, you can add yourself to the waitlist by signing up on the page linked below. We'll email as soon as it's ready for you to try.

Linux Support

We haven't forgotten about Linux. About 10% of Cloudflare's employees run Linux on their desktops. As soon as we get the macOS and Windows clients out we’ll turn our attention to building a WARP client for Linux.

Thank you to everyone who helped us make WARP fast, efficient, and reliable on mobile. It's incredible how far it's come over the last year. If you tried it early in the beta last year but aren't using it now, I encourage you to give it another try. We're looking forward to bringing WARP speed and security to even more devices.


Intel GCC Patches + PRM Update Adds SERIALIZE Instruction, Confirm Atom+Core Hybrid CPUs [Phoronix]

Intel has seemingly just updated their public programming reference manual as well as sending out some new patches to the GCC compiler for supporting new instructions on yet-to-be-released CPUs...


Linux 5.7 Graphics Driver Updates Enable Tiger Lake By Default, OLED Backlight Support [Phoronix]

The Linux 5.7 Direct Rendering Manager (DRM) updates have been submitted as the kernel graphics driver changes for this next kernel feature release. As usual, there is a lot of work especially on the Intel and AMD Radeon side while nothing was queued for the open-source NVIDIA (Nouveau) driver...


GhostBSD 20.03 Is Out As The Latest Monthly Update To This Desktop BSD [Phoronix]

If you are looking for a new desktop-friendly BSD with TrueOS being phased out, GhostBSD 20.03 is out: a promising desktop-focused OS based on FreeBSD, using the MATE desktop environment for a decent out-of-the-box experience...


Cloudflare now supports security keys with Web Authentication (WebAuthn)! [The Cloudflare Blog]


We’re excited to announce that Cloudflare now supports security keys as a two factor authentication (2FA) method for all users. Cloudflare customers now have the ability to use security keys on WebAuthn-supported browsers to log into their user accounts. We strongly suggest users configure multiple security keys and 2FA methods on their account in order to access their apps from various devices and browsers. If you want to get started with security keys, visit your account's 2FA settings.


What is WebAuthn?

WebAuthn is a standardized protocol for authentication online using public key cryptography. It is part of the FIDO2 Project and is backwards compatible with FIDO U2F. Depending on your device and browser, you can use hardware security keys (like YubiKeys) or built-in biometric support (like Apple Touch ID) to authenticate to your Cloudflare user account as a second factor. WebAuthn support is rapidly increasing among browsers and devices, and we’re proud to join the growing list of services that offer this feature.

To use WebAuthn, a user registers their security key, or “authenticator”, to a supporting application, or “relying party” (in this case Cloudflare). The authenticator then generates and securely stores a public/private keypair on the device. The keypair is scoped to a specific domain and user account. The authenticator then sends the public key to the relying party, who stores it. A user may have multiple authenticators registered with the same relying party. In fact, it’s strongly encouraged for a user to do so in case an authenticator is lost or broken.

When a user logs into their account, the relying party will issue a randomly generated byte sequence called a “challenge”. The authenticator will prompt the user for “interaction” in the form of a tap, touch or PIN before signing the challenge with the stored private key and sending it back to the relying party. The relying party evaluates the signed challenge against the public key(s) it has stored associated with the user, and if the math adds up the user is authenticated! To learn more about how WebAuthn works, take a look at the official documentation.
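
The registration and challenge/response flow above can be modelled with a short sketch. Note loudly: real WebAuthn authenticators sign challenges with a per-site asymmetric keypair (e.g. ES256), while this toy uses a random HMAC key purely so the example is self-contained with the standard library; the class and domain names are invented, and this must never be used for real authentication.

```python
import hashlib
import hmac
import secrets

# Toy model of the WebAuthn flow described above. Credentials are scoped to
# (relying-party domain, user), which is the property that defeats phishing.

class ToyAuthenticator:
    def __init__(self):
        self._creds = {}  # (rp_domain, user) -> secret key

    def register(self, rp_domain, user):
        """Registration: create a credential scoped to this domain and user."""
        self._creds[(rp_domain, user)] = secrets.token_bytes(32)

    def sign(self, rp_domain, user, challenge):
        """Login: sign the relying party's random challenge, if a credential exists."""
        key = self._creds.get((rp_domain, user))
        if key is None:
            # A look-alike phishing domain has no credential, so login fails.
            raise LookupError(f"no credential for {rp_domain}")
        return hmac.new(key, challenge, hashlib.sha256).digest()


authenticator = ToyAuthenticator()
authenticator.register("cloudflare.com", "alice")

# The relying party issues a random challenge; the authenticator signs it.
challenge = secrets.token_bytes(32)
assert authenticator.sign("cloudflare.com", "alice", challenge)

# The same challenge presented by cloudfare[.]com (note the typo) fails,
# mirroring WebAuthn's domain-scoped phishing resistance.
try:
    authenticator.sign("cloudfare.com", "alice", challenge)
except LookupError:
    pass
else:
    raise AssertionError("phishing domain should not have a credential")
```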


How is WebAuthn different from other 2FA methods?

There’s a lot of hype about WebAuthn, and rightfully so. But there are some common misconceptions about how WebAuthn actually works, so I wanted to take some time to explain why it’s so effective against various credential-based attacks.

First, WebAuthn relies on a “physical thing you have” rather than an app or a phone number, which makes it a lot harder for a remote attacker to impersonate a victim. This assumption prevents common exploits like SIM swapping, which is an attack used to bypass SMS-based verification. In contrast, an attacker physically (and cryptographically) cannot “impersonate” a hardware security key unless they have physical access to a victim’s unlocked device.

WebAuthn is also simpler and quicker to use compared to mobile app-based 2FA methods. Users often complain about the amount of time it takes to reach for their phone, open an app, and copy over an expiring passcode every time they want to log into an account. By contrast, security keys require a simple touch or tap on a piece of hardware that’s often attached to a device.

But where WebAuthn really shines is its particular resistance to phishing attacks. Phishing often requires an attacker to construct a believable fake replica of a target site. For example, an attacker could try to register cloudfare[.]com (notice the typo!) and construct a site that looks similar to the genuine cloudflare[.]com. The attacker might then try to trick a victim into logging into the fake site and disclosing their credentials. Even if the victim has mobile app TOTP authentication enabled, a sophisticated attacker can still proxy requests from the fake site to the genuine site and successfully authenticate as the victim. This is the assumption behind powerful man-in-the-middle tools like evilginx.

WebAuthn prevents users from falling victim to common phishing and man-in-the-middle attacks because it takes the domain name into consideration when creating user credentials. When an authenticator creates the public/private keypair, it is specifically scoped to a particular account and domain. So let’s say a user with WebAuthn configured navigates to the phishy cloudfare[.]com site. When the phishy site prompts the authenticator to sign its challenge, the authenticator will attempt to find credentials for that phishy site’s domain and, upon failing to find any, will error and prevent the user from logging in. This is why hardware security keys are among the most secure authentication methods in existence today according to research by Google.


WebAuthn also has very strict privacy guarantees. If a user authenticates with a biometric key (like Apple TouchID or Windows Hello), the relying party never receives any of that biometric data. The communication between authenticator and client browser is completely separate from the communication between client browser and relying party. WebAuthn also urges relying parties to not disclose user-identifiable information (like email addresses) during registration or authentication. This helps prevent replay or user enumeration attacks. And because credentials are strictly scoped to a particular relying party and domain, a malicious relying party won’t be able to gain information about other relying parties an authenticator has created credentials for in order to track a user’s various accounts.

Finally, WebAuthn is great for relying parties because they don’t have to store anything additionally sensitive about a user. The relying party simply stores a user’s public key. An attacker who gains access to the public key can’t do much with it because they won’t know the associated private key. This is markedly less risky than TOTP, where a relying party must use proper hygiene to store a TOTP secret seed from which all subsequent time-based user passcodes are generated.

Security isn’t always intuitive

Sometimes in the security industry we have the tendency to fixate on new and sophisticated attacks. But often it’s the same old “simple” problems that have the highest impact. Two factor authentication is a textbook case where the security industry largely believes a concept is trivial, but the average user still finds it confusing or annoying. WebAuthn addresses this problem because it’s quicker and more secure for the end user compared to other authentication methods. We think the trend towards security key adoption will continue to grow, and we’re looking forward to doing our part to help the effort.

Note: If you login to your Cloudflare user account with Single Sign-On (SSO), you will not have the option to use two factor authentication (2FA). This is because your SSO provider manages your 2FA methods. To learn more about Cloudflare’s 2FA offerings, please visit our support center.

Tuesday, 31 March



Announcing the Results of the Public DNS Resolver Privacy Examination [The Cloudflare Blog]


On April 1, 2018, we took a big step toward improving Internet privacy and security with the launch of, the Internet's fastest, privacy-first public DNS resolver. And we really meant privacy first. We were not satisfied with the status quo and believed that secure DNS resolution with transparent privacy practices should be the new normal. So we committed to our public resolver users that we would not retain any personal data about requests made using our resolver. We also built in technical measures to facilitate DNS over HTTPS to help keep your DNS queries secure. We've never wanted to know what individuals do on the Internet, and we took technical steps to ensure we can't know.
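
For the curious, here is roughly what such an encrypted DNS query looks like on the wire. This sketch (mine, not Cloudflare's) builds the GET-style DNS-over-HTTPS request URL defined by RFC 8484, targeting Cloudflare's public cloudflare-dns.com endpoint; actually sending it is left to any HTTP client.

```python
import base64
import struct

def doh_query_url(name, qtype=1, resolver="https://cloudflare-dns.com/dns-query"):
    """Build an RFC 8484 DNS-over-HTTPS GET URL for `name` (qtype 1 = A record)."""
    # 12-byte DNS header: ID 0 (recommended for DoH cacheability), recursion
    # desired, one question, no answer/authority/additional records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME is a sequence of length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    # The `dns` parameter is base64url without padding, per the RFC.
    encoded = base64.urlsafe_b64encode(header + question).rstrip(b"=").decode()
    return f"{resolver}?dns={encoded}"

# Fetching this URL with an HTTP client that sends
# `accept: application/dns-message` returns the answer in DNS wire format.
print(doh_query_url("example.com"))
```

Because the query travels inside ordinary HTTPS, an on-path observer sees only a TLS connection to the resolver, not which name was looked up.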

We knew there would be skeptics. Many consumers believe that if they aren't paying for a product, then they are the product. We don't believe that has to be the case. So we committed to retaining a Big 4 accounting firm to perform an examination of our resolver privacy commitments.

Today we're excited to announce that the resolver examination has been completed, and a copy of the independent accountants' report can be obtained from our compliance page.

The examination process

We gained a number of observations and lessons from the privacy examination of the resolver. First, we learned that it takes much longer to agree on terms and complete an examination when you ask an accounting firm to perform what we believe is a first-of-its-kind examination of custom privacy commitments for a recursive resolver.

We also observed that privacy by design works. Not that we were surprised: we use privacy-by-design principles in all our products and services. Because we baked anonymization best practices into the resolver when we built it, we were able to demonstrate that we didn't have any personal data to sell. More specifically, in accordance with RFC 6235, we decided to truncate the client/source IP address at our edge data centers so that we never store the full IP address of a user in non-volatile storage.

We knew that a truncated IP address would be enough to help us understand general Internet trends and where traffic is coming from. In addition, we also further improved our privacy-first approach by replacing the truncated IP address with the network number (the ASN) for our internal logs. On top of that, we committed to only retaining those anonymized logs for a limited period of time. It’s the privacy version of belt plus suspenders plus another belt.

Finally, we learned that aligning our examination of the resolver with our SOC 2 report most efficiently demonstrated that we had the appropriate change-control procedures and audit logs in place to confirm that our IP truncation logic and limited data-retention periods were in effect during the examination period. The resolver examination period of February 1, 2019, through October 31, 2019, was the earliest we could go back to while relying on our SOC 2 report.

Details on the examination

When we launched the resolver, we committed that we would not track what individual users of our resolver are searching for online. The examination validated that our system is configured to achieve what we think is the most important part of this commitment: we never write the querying IP addresses together with the DNS query to disk, and therefore have no idea who is making a specific request using the resolver. This means we don't track which sites any individual visits, and we won't sell your personal data, ever.

We want to be fully transparent that during the examination we uncovered that our routers randomly capture up to 0.05% of all requests that pass through them, including the querying IP addresses of resolver users. We do this separately from the service for all traffic passing into our network, and we retain such data for a limited period of time for use in network troubleshooting and mitigating denial-of-service attacks.

To explain: if a specific IP address flows through one of our data centers a large number of times, it is often associated with malicious requests or a botnet. We need to keep that information to mitigate attacks against our network and to prevent our network from being used as an attack vector itself. This limited subsample of data is not linked to the DNS queries handled by the service and does not have any impact on user privacy.

We also want to acknowledge that when we made our privacy promises about how we would handle non-personally-identifiable log data for resolver requests, we made what we now see were some confusing statements about how we would handle those anonymous logs.

For example, our blog post commitment about retention of anonymous log data was not written clearly enough: we referred to temporary logs, transactional logs, and permanent logs in ways that could have been better defined. Our resolver privacy FAQs stated that we would not retain transactional logs for more than 24 hours but that some anonymous logs would be retained indefinitely, yet our blog post announcing the public resolver didn’t capture that distinction. You can see a clearer statement about our handling of anonymous logs on our privacy commitments page mentioned below.

With this in mind, we updated and clarified our privacy commitments for the resolver as outlined below. The most critical part of these commitments remains unchanged: We don’t want to know what you do on the Internet — it’s none of our business — and we’ve taken the technical steps to ensure we can’t.

Our public DNS resolver commitments

We have refined our commitments to resolver privacy as part of our examination effort. The nature and intent of our commitments remain consistent with our original commitments. These updated commitments are what was included in the examination:

  1. Cloudflare will not sell or share public resolver users’ personal data with third parties or use personal data from the public resolver to target any user with advertisements.
  2. Cloudflare will only retain or use what is being asked, not information that will identify who is asking it. Except for randomly sampled network packets captured from at most 0.05% of all traffic sent to Cloudflare’s network infrastructure, Cloudflare will not retain the source IP from DNS queries to the public resolver in non-volatile storage (more on that below). The randomly sampled packets are solely used for network troubleshooting and DoS mitigation purposes.
  3. A public resolver user’s IP address (referred to as the client or source IP address) will not be stored in non-volatile storage. Cloudflare will anonymize source IP addresses via IP truncation methods (last octet for IPv4 and last 80 bits for IPv6). Cloudflare will delete the truncated IP address within 25 hours.
  4. Cloudflare will retain only the limited transaction and debug log data (“Public Resolver Logs”) for the legitimate operation of our Public Resolver and research purposes, and Cloudflare will delete the Public Resolver Logs within 25 hours.
  5. Cloudflare will not share the Public Resolver Logs with any third parties except for APNIC pursuant to a Research Cooperative Agreement. APNIC will only have limited access to query the anonymized data in the Public Resolver Logs and conduct research related to the operation of the DNS system.

Proving privacy commitments

We created the resolver because we recognized significant privacy problems: ISPs, WiFi networks you connect to, your mobile network provider, and anyone else listening in on the Internet can see every site you visit and every app you use — even if the content is encrypted. Some DNS providers even sell data about your Internet activity or use it to target you with ads. DNS can also be used as a tool of censorship against many of the groups we protect through our Project Galileo.

If you use DNS-over-HTTPS or DNS-over-TLS to our resolver, your DNS lookup request will be sent over a secure channel. This means that if you use the resolver then in addition to our privacy guarantees an eavesdropper can’t see your DNS requests. We promise we won’t be looking at what you’re doing.
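As a concrete illustration, a DNS-over-HTTPS lookup can be made against Cloudflare's public JSON API endpoint at cloudflare-dns.com/dns-query. The Go sketch below (the helper name is ours) only builds the request; actually sending it requires network access:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// buildDoHQuery constructs a DNS-over-HTTPS request against
// Cloudflare's public JSON API endpoint. The endpoint, parameters,
// and Accept header follow Cloudflare's published DoH documentation.
func buildDoHQuery(name, qtype string) (*http.Request, error) {
	u := url.URL{
		Scheme:   "https",
		Host:     "cloudflare-dns.com",
		Path:     "/dns-query",
		RawQuery: url.Values{"name": {name}, "type": {qtype}}.Encode(),
	}
	req, err := http.NewRequest("GET", u.String(), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/dns-json")
	return req, nil
}

func main() {
	req, _ := buildDoHQuery("example.com", "A")
	fmt.Println(req.URL) // https://cloudflare-dns.com/dns-query?name=example.com&type=A
	// resp, err := http.DefaultClient.Do(req) // actually perform the lookup
}
```

The DNS-over-TLS variant instead speaks the standard DNS wire format over a TLS connection to port 853; either way, an on-path eavesdropper sees only the encrypted channel.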

We strongly believe that consumers should expect their service providers to be able to show proof that they are actually abiding by their privacy commitments. If we were able to have our resolver privacy commitments examined by an independent accounting firm, we think other organizations can do the same. We encourage other providers to follow suit and help improve privacy and transparency for Internet users globally. And for our part, we will continue to engage well-respected auditing firms to audit our resolver privacy commitments. We also appreciate the work that Mozilla has undertaken to encourage entities that operate recursive resolvers to adopt data handling practices that protect the privacy of user data.

Details of the resolver privacy examination and our accountant’s opinion can be found on Cloudflare’s Compliance page.

Visit 1.1.1.1 from any device to get started with the Internet's fastest, privacy-first DNS service.

PS Cloudflare has traditionally used tomorrow, April 1, to release new products. Two years ago we launched the free, fast, privacy-focused public DNS resolver. One year ago we launched WARP, our way of securing and accelerating mobile Internet access.

And tomorrow?

Then three key changes
One before the weft, also
Safety to the roost

Monday, 30 March


Saturday Morning Breakfast Cereal - Performance Review [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

If you can find 3 more of you, I get an even bigger raise!

Today's News:


Introducing Quicksilver: Configuration Distribution at Internet Scale [The Cloudflare Blog]


Cloudflare’s network processes more than fourteen million HTTP requests per second at peak for Internet users around the world. We spend a lot of time thinking about the tools we use to make those requests faster and more secure, but a secret-sauce which makes all of this possible is how we distribute configuration globally. Every time a user makes a change to their DNS, adds a Worker, or makes any of hundreds of other changes to their configuration, we distribute that change to 200 cities in 90 countries where we operate hardware. And we do that within seconds. The system that does this needs to not only be fast, but also impeccably reliable: more than 26 million Internet properties are depending on it. It also has had to scale dramatically as Cloudflare has grown over the past decade.

Historically, we built this system on top of the Kyoto Tycoon (KT) datastore. In the early days, it served us incredibly well. We contributed support for encrypted replication and wrote a foreign data wrapper for PostgreSQL. However, what worked for the first 25 cities was starting to show its age as we passed 100. In the summer of 2015 we decided to write a replacement from scratch. This is the story of how and why we outgrew KT, learned we needed something new, and built what was needed.

How KT Worked at Cloudflare

Where should traffic be directed?

What is the current load balancing weight of the second origin of this website?

Which pages of this site should be stored in the cache?

These are all questions which can only be answered with configuration, provided by users and delivered to our machines around the world which serve Internet traffic. We are massively dependent on our ability to get configuration from our API to every machine around the world.

It was not acceptable for us to make Internet requests on demand to load this data, however; the data had to live in every edge location. The architecture of the edge which serves requests is designed to be highly failure tolerant. Each data center must be able to successfully serve requests even if cut off from any source of central configuration or control.

Our first large-scale attempt to solve this problem relied on deploying Kyoto-Tycoon (KT) to thousands of machines. Our centralized web services would write values to a set of root nodes which would distribute the values to management nodes living in every data center. Each server would eventually get its own copy of the data from a management node in the data center in which it was located:

Data flow from the API to data centres and individual machines


Doing at least one read from KT was on the critical path of virtually every Cloudflare service. Every DNS or HTTP request would send multiple requests to a KT store and each TLS handshake would load certificates from it. If KT was down, many of our services were down; if KT was slow, many of our services were slow. Having a service like KT was something of a superpower for us, making it possible for us to deploy new services and trust that configuration would be fast and reliable. But when it wasn’t, we had very big problems.

As we began to scale one of our first fixes was to shard KT into different instances. For example, we put Page Rules in a different KT than DNS records. In 2015, we used to operate eight such instances storing a total of 100 million key-value (KV) pairs, with about 200 KV values changed per second. Across our infrastructure, we were running tens of thousands of these KT processes. Kyoto Tycoon is, to say the least, very difficult to operate at this scale. It’s clear we pushed it past its limits and beyond what it was designed for.

To put that in context it’s valuable to look back to the description of KT provided by its creators:

[It] is a lightweight datastore server with auto expiration mechanism, which is useful to handle cache data and persistent data of various applications.

It seemed likely that we were not uncovering design failures of KT, but rather we were simply trying to get it to solve problems it was not designed for. Let’s take a deeper look at the issues we uncovered.

Exclusive write lock… or not?

To talk about the write lock it’s useful to start with another description provided by the KT documentation:

Functions of API are reentrant and available in multi-thread environment. Different database objects can be operated in parallel entirely. For simultaneous operations against the same database object, rwlock (reader-writer lock) is used for exclusion control. That is, while a writing thread is operating an object, other reading threads and writing threads are blocked. However, while a reading thread is operating an object, reading threads are not blocked. Locking granularity depends on data structures. The hash database uses record locking. The B+ tree database uses page locking.

At first glance this sounds great! There is no exclusive write lock over the entire DB. But there are no free lunches; as we scaled we started to detect poor performance when writing and reading from KT at the same time.

In the world of Cloudflare, each KT process replicated from a management node and received anywhere from a few to a thousand writes per second. The same process would serve thousands of read requests per second as well. When heavy write bursts were happening we would notice an increase in the read latency from KT. This was affecting production traffic, resulting in slower responses than we expect of our edge.

Here are the percentiles for read latency of a script reading from KT the same 20 key/value pairs in an infinite loop. Each key and each value is 2 bytes.  We never update these key/value pairs so we always get the same value.

Without doing any writes, the read performance is somewhat acceptable even at high percentiles:

  • P99: 9ms
  • P99.9: 15ms

When we add a writer sequentially adding a 40kB value, however, things get worse. After running the same read performance test, our latency values have skyrocketed:

  • P99: 154ms
  • P99.9: 250ms

Adding 250ms of latency to a request through Cloudflare would never be acceptable. It gets even worse when we add a second writer: suddenly our latency at the 99.9th percentile (the slowest 0.1% of reads) is over a second!

  • P99: 701ms
  • P99.9: 1215ms

These numbers are concerning: writing more increases the read latency significantly. Given how many sites change their Cloudflare configuration every second, it was impossible for us to imagine a world where write loads would not be high. We had to track down the source of this poor performance. There must be either resource contention or some form of write locking, but where?

The Lock Hunt

After looking into the code, we found the issue: when reading from KT, the function accept in the file kcplandb.h of Kyoto Cabinet acquires a lock:

[Screenshot: the accept function in kcplandb.h acquiring the lock]

This lock is also acquired in the synchronize function in charge of flushing data to disk:

[Screenshot: the synchronize function acquiring the same lock]

This is where we have a problem. Flushing to disk blocks all reads and flushing is slow.

So in theory the storage engine can handle parallel requests, but in reality we found at least one place where this is not true. Based on this and other experiments and code review we came to the conclusion that KT was simply not designed for concurrent access. Due to the exclusive write lock implementation of KT, I/O writes degraded read latency to unacceptable levels.

In the beginning, the occurrence of that issue was rare and not a top priority. But as our customer base grew at rocket speed, all related datasets grew at the same pace. The number of writes per day was increasing constantly and this contention started to have an unacceptable impact on performance.

As you can imagine our immediate fix was to do fewer writes. We were able to make small-scale changes to writing services to reduce their load, but this was quickly eclipsed by the growth of the company and the launch of new products. Before we knew it, our write levels were right back to where they began!

As a final step we disabled the fsync which KT was doing on each write. This meant KT would only flush to disk on shutdown, introducing potential data corruption which required its own tooling to detect and repair.

Unsynchronized Swimming

Continuing our theme of beginning with the KT documentation, it’s worth looking at how they discuss non-durable writes:

If an application process which opened a database terminated without closing the database, it involves risks that some records may be missing and the database may be broken. By default, durability is settled when the database is closed properly and it is not settled for each updating operation.

At Cloudflare scale, kernel panics or even processor bugs happen and unexpectedly kill services. By turning off syncing to improve performance we began to experience database corruption. KT comes with a mechanism to repair a broken DB which we used successfully at first. Sadly, on our largest databases, the rebuilding process took a very long time and in many cases would not complete at all. This created a massive operational problem for our SRE team.

Ultimately we turned off the auto-repair mechanism so KT would not start if the DB was broken, and each time we lost a database we copied it from a healthy node. This syncing was being done manually by our SRE team. That team’s time is much better spent building systems and investigating problems; the manual work couldn’t continue.

Not syncing to disk caused another issue: KT had to flush the entire DB when it was being shut down. Again, this worked fine at the beginning, but with the DB getting bigger and bigger, the shutdown time started to sometimes hit the systemd grace period and KT was terminated with a SIGKILL. This led to even more database corruption.

Because all of the KT DB instances were growing at the same pace, this issue went from minor to critical seemingly overnight. SREs wasted hours syncing DBs from healthy instances before we understood the problem and greatly increased the grace period provided by systemd.

We also experienced numerous random instances of database corruption. Too often KT was shut down cleanly without any error, but when restarted the DB was corrupted and had to be restored. In the beginning, with 25 data centers, it happened rarely. Over the years we added thousands of new servers to Cloudflare infrastructure and it was occurring multiple times a day.

Writing More

Most of our writes are adding new KV pairs, not overwriting or deleting. We can see this in our key count growth:

  1. In 2015, we had around 100 million KV pairs
  2. In 2017, we passed 200 million
  3. In 2018, we passed 500 million
  4. In 2019, we exceeded 1 billion

Unfortunately in a world where the quantity of data is always growing, it’s not realistic to think you will never flush to disk. As we write new keys the page cache quickly fills. When it’s full, it is flushed to disk. I/O saturation was leading to the very same contention problems we experienced previously.

Each time KT received a heavy write burst, we could see the read latency from KT increasing in our DC. At that point it was obvious to us that the KT DB locking implementation could no longer do the job for us with or without syncing. Storage wasn't our only problem, however; the other key function of KT and our configuration system is replication.

The Best Effort Replication Protocol

The KT replication protocol is based solely on timestamps. If a transaction fails to replicate for any reason but the failure is not detected, the timestamp will continue to advance, forever missing that entry.

How can we have missing log entries? KT replicates data by sending an ordered list of transaction logs. Each log entry details what change is being made to our configuration database. These logs are kept for a period of time, but are eventually ‘garbage collected’, with old entries removed.

Consider a KT instance that has been down for days: it restarts and asks for the transaction logs that follow the last timestamp it received. The management node receiving the request will send the nearest entries to this timestamp, but there could be missing transaction logs due to garbage collection. The client would not get all the updates it should, and this is how we quietly end up with an inconsistent DB.

Another weakness we noticed happens when the timestamp file is being written. Here is a snippet of the file where the client replication code is implemented:

[Screenshot: the client replication loop; the call to write_rts appears at the bottom]

This code snippet runs the loop as long as replication is working fine. The timestamp file is only going to be written when the loop terminates. The call to write_rts (the function writing to disk the last applied transaction log) can be seen at the bottom of the screenshot.

If KT terminates unexpectedly in the middle of that loop, the timestamp file won’t be updated. When this KT restarts, if it successfully repairs the database it will replicate from the last value written to the rts file. KT could end up replaying days of transaction logs which were already applied to the DB, and values written days ago could be made visible again to our services for some time before everything gets back up to date!

We also regularly experienced databases getting out of sync without any reason. Sometimes these caught up by themselves, sometimes they didn’t. We have never been able to properly identify the root cause of that issue. Distributed systems are hard, and distributed databases are brutal. They require extensive observability tooling to deploy properly which didn’t exist for KT.

Upgrading Kyoto Tycoon in Production

Multiple processes cannot access one database file at the same time. A database file is locked by reader-writer lock while a process is connected to it.

We release hundreds of software updates a day across our many engineering teams. However, we only very rarely deploy large-scale updates and upgrades to the underlying infrastructure which runs our code. This frequency has increased over time, but in 2015 we would do a “CDN Release” once per quarter.

To perform a CDN Release we reroute traffic from specific servers and take the time to fully upgrade the software running on those machines all the way down to the kernel. As this was only done once per quarter, it could take several months for an upgrade to a service like KT to make it to every machine.

Most Cloudflare services now implement a zero downtime upgrade mechanism where we can upgrade the service without dropping production traffic. With this we can release a new version of our web or DNS servers outside of a CDN release window, which allows our engineering teams to move much faster. Many of our services are now actually implemented with Cloudflare Workers which can be deployed even faster (using KT’s replacement!).

Unfortunately this was not the case in 2015 when KT was being scaled. Problematically, KT does not allow multiple processes to concurrently access the same database file so starting a new process while the previous one was still running was impossible. One idea was that we could stop KT, hold all incoming requests and start a new one. Unfortunately stopping KT would usually take over 15 minutes with no guarantee regarding the DB status.

Because stopping KT is very slow and only one KT process can access the DB, it was not possible to upgrade KT outside of a CDN release, locking us into that aging process.

High-ish Availability

One final quote from the KT documentation:

Kyoto Tycoon supports "dual master" replication topology which realizes higher availability. It means that two servers replicate each other so that you don't have to restart the survivor when one of them crashed.


Note that updating both of the servers at the same time might cause inconsistency of their databases. That is, you should use one master as a "active master" and the other as a "standby master".

In other words: when dual master is enabled all writes should always go to the same root node, and a switch should be performed manually to promote the standby master when the root node dies.

Unfortunately that violates our principles of high availability. With no capability for automatic zero-downtime failover it wasn’t possible to handle the failure of the KT top root node without some amount of configuration propagation delay.

Building Quicksilver

Addressing these issues in Kyoto Tycoon wasn’t deemed feasible. The project had no maintainer, the last official update being from April 2012, and was composed of a code base of 100k lines of C++. We looked at alternative open source systems at the time, none of which fit our use case well.

Our KT implementation suffered from some fundamental limitations:

  1. No high availability
  2. Weak replication protocol
  3. Exclusive write lock
  4. Not zero downtime upgrade friendly

It was also unreliable: critical replication and database functionality would break quite often. At some point, keeping KT up and running at Cloudflare was consuming 48 hours of SRE time per week.

We decided to build our own replicated key value store tailored for our needs and we called it Quicksilver. As of today Quicksilver powers an average of 2.5 trillion reads each day with an average latency in microseconds.

Fun fact: the name Quicksilver was picked by John Graham-Cumming, Cloudflare’s CTO. The terrible secret that only very few members of the humankind know is that he originally named it “Velocireplicator”. It is a secret though. Don’t tell anyone. Thank you.

Storage Engine

One major complication with our legacy system, KT, was the difficulty of bootstrapping new machines. Replication is a slow way to populate an empty database; it’s much more efficient to be able to instantiate a new machine from a snapshot containing most of the data, and then only use replication to keep it up to date. Unfortunately KT required nodes to be shut down before they could be snapshotted, making this challenging.

One requirement for Quicksilver then was to use a storage engine which could provide running snapshots. Even further, as Quicksilver is performance critical, a snapshot must also not have a negative impact on other services that read from Quicksilver. With this requirement in mind we settled on a datastore library called LMDB after extensive analysis of different options.

LMDB’s design makes taking consistent snapshots easy. LMDB is also optimized for low read latency rather than write throughput. This is important since we serve tens of millions of reads per second across thousands of machines, but only change values relatively infrequently. In fact, systems that switched from KT to Quicksilver saw drastically reduced read response times, especially on heavily loaded machines. For example, for our DNS service, the 99th percentile of reads dropped by two orders of magnitude!

LMDB also allows multiple processes to concurrently access the same datastore. This is very useful for implementing zero downtime upgrades for Quicksilver: we can start the new version while still serving current requests with the old version. Many data stores implement an exclusive write lock which requires only a single user to write at a time, or even worse, restricts reads while a write is conducted. LMDB does not implement any such lock.

LMDB is also append-only, meaning it only writes new data; it doesn’t overwrite existing data. Beyond that, nothing is ever written to disk in a state which could be considered corrupted. This makes it crash-proof: after any termination it can immediately be restarted without issue. This means it does not require any type of crash recovery tooling.

Transaction Logs

LMDB does a great job of allowing us to query Quicksilver from each of our edge servers, but it alone doesn’t give us a distributed database. We also needed to develop a way to distribute the changes made to customer configurations into the thousands of instances of LMDB we now have around the world. We quickly settled on a fan-out type distribution where nodes would query master-nodes, who would in turn query top-masters, for the latest updates.


Unfortunately there is no such thing as a perfectly reliable network or system. It is easy for a network to become disconnected or a machine to go down just long enough to miss critical replication updates. Conversely though, when users make changes to their Cloudflare configuration it is critical that they propagate accurately whatever the condition of the network. To ensure this, we used one of the oldest tricks in the book and included a monotonically increasing sequence number in our Quicksilver protocol:

< 0023 SET "hello" "world"
< 0024 SET "lorem" "ipsum"
< 0025 DEL "42"
…

It is now easily possible to detect whether an update was lost by comparing the sequence number and making sure it is exactly one higher than the last message we have seen. The astute reader will notice that this is simply a log. This process is pleasantly simple because our system does not need to support global writes, only reads. Writes are relatively infrequent, so it is easy for us to elect a single data center to aggregate them and ensure the monotonicity of our counter.
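A minimal sketch of this sequence check in Go, using hypothetical type and field names (the real Quicksilver log format is not public):

```go
package main

import "fmt"

// LogEntry is a hypothetical, simplified replication log record.
type LogEntry struct {
	Index uint64 // monotonically increasing sequence number
	Op    string // "SET" or "DEL"
	Key   string
	Value string
}

// Replica applies entries in order and detects gaps in the sequence.
type Replica struct {
	lastIndex uint64
	data      map[string]string
}

func NewReplica() *Replica {
	return &Replica{data: make(map[string]string)}
}

// Apply returns an error if an entry was lost or reordered: each
// index must be exactly one higher than the last one seen.
func (r *Replica) Apply(e LogEntry) error {
	if e.Index != r.lastIndex+1 {
		return fmt.Errorf("gap detected: expected index %d, got %d", r.lastIndex+1, e.Index)
	}
	switch e.Op {
	case "SET":
		r.data[e.Key] = e.Value
	case "DEL":
		delete(r.data, e.Key)
	}
	r.lastIndex = e.Index
	return nil
}

func main() {
	r := NewReplica()
	fmt.Println(r.Apply(LogEntry{Index: 1, Op: "SET", Key: "hello", Value: "world"})) // <nil>
	fmt.Println(r.Apply(LogEntry{Index: 3, Op: "DEL", Key: "42"}))                    // gap detected
}
```

On a detected gap, a real replica would re-request the missing entries from its master rather than continue applying updates.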

Configuration errors are among the most common failure modes of a distributed system. An analysis of how things could go wrong led us to realize that we had missed a simple failure case: since we are running separate Quicksilver instances for different kinds of data, we could corrupt a database by misconfiguring it. For example, nothing would prevent the DNS database from being updated with changes for the Page Rules database. The solution was to add unique IDs for each database and to require these IDs when initiating the replication protocol.

Through our experience with our legacy system, KT, we knew that replication does not always scale as easily as we would like. With KT it was common for us to saturate IO on our machines, slowing down reads as we tried to replay the replication log. To solve this with Quicksilver we decided to engineer a batch mode where many updates can be combined into a single write, allowing them all to be committed to disk at once. This significantly improved replication performance by reducing the number of disk writes we have to make. Today, we are batching all updates which occur in a 500ms window, and this has made highly-durable writes manageable.
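A batcher of this kind might be sketched as follows. The 500ms window matches the post; the channel-based plumbing and the commit callback are our own assumptions about shape, not Quicksilver's actual code:

```go
package main

import (
	"fmt"
	"time"
)

// Update is a single replicated change waiting to be committed.
type Update struct {
	Key, Value string
}

// batchCommits drains updates from ch and commits everything that
// arrived within each 500ms window in one call to commit, which
// stands in for a single durable LMDB write transaction.
func batchCommits(ch <-chan Update, commit func([]Update)) {
	ticker := time.NewTicker(500 * time.Millisecond)
	defer ticker.Stop()

	var pending []Update
	for {
		select {
		case u, ok := <-ch:
			if !ok { // channel closed: flush and stop
				if len(pending) > 0 {
					commit(pending)
				}
				return
			}
			pending = append(pending, u)
		case <-ticker.C:
			if len(pending) > 0 {
				commit(pending)
				pending = nil
			}
		}
	}
}

func main() {
	ch := make(chan Update)
	done := make(chan struct{})
	go func() {
		batchCommits(ch, func(batch []Update) {
			fmt.Printf("committed %d updates in one transaction\n", len(batch))
		})
		close(done)
	}()
	for i := 0; i < 3; i++ {
		ch <- Update{Key: fmt.Sprintf("k%d", i), Value: "v"}
	}
	close(ch)
	<-done
}
```

The trade-off is bounded staleness: an update may wait up to the window length before it is durable, in exchange for far fewer disk writes.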

Where should we store the transaction logs? For design simplicity we decided to store these within a different bucket of our LMDB database. By doing this, we can commit the transaction log and the update to the database in one shot. Originally the log was kept in a separate file, but storing it with the database simplifies the code.

Unfortunately this came at a cost: fragmentation. LMDB does not naturally fragment values, it needs to store every value in a sequential region of the disk. Eventually the disk begins to fill and the large regions which offer enough space to fit a particularly big value start to become hard to find. As our disks begin to fill up, it can take minutes for LMDB to find enough space to store a large value. Unfortunately we exacerbated this problem by storing the transaction log within the database.

This fragmentation issue was not only causing high write latency, it was also making the databases grow very quickly. When the reasonably-sized free spaces between values start to become filled, less and less of the disk becomes usable. Eventually the only free space on disk is in regions too small to store any of the actual values we need to store. If all of this space were compacted into a single region, however, there would be plenty of space available.

The compaction process requires rewriting an entire DB from scratch. This is something we do after bringing data centers offline and its benefits last for around 2 months, but it is far from a perfect solution. To do better we began fragmenting the transaction log into page-sized chunks in our code to improve the write performance. Eventually we will also split large values into chunks that are small enough for LMDB to happily manage, and we will handle assembling these chunks in our code into the actual values to be returned.
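The chunking idea can be illustrated with a small sketch. The 4096-byte chunk size is our assumption (a typical page size); Quicksilver's actual chunk size and on-disk key scheme are internal details:

```go
package main

import "fmt"

// chunkSize is an assumed page-sized limit; values larger than this
// are split so LMDB never has to find one large contiguous region.
const chunkSize = 4096

// chunkValue splits a large value into page-sized chunks.
func chunkValue(v []byte) [][]byte {
	var chunks [][]byte
	for len(v) > chunkSize {
		chunks = append(chunks, v[:chunkSize])
		v = v[chunkSize:]
	}
	return append(chunks, v)
}

// assemble reverses chunkValue when the value is read back.
func assemble(chunks [][]byte) []byte {
	var v []byte
	for _, c := range chunks {
		v = append(v, c...)
	}
	return v
}

func main() {
	v := make([]byte, 10000)
	chunks := chunkValue(v)
	fmt.Println(len(chunks))           // 3
	fmt.Println(len(assemble(chunks))) // 10000
}
```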

We also implemented a key-value level CRC. The checksum is written when the transaction log is applied to the DB and checked when the KV pair is read. This checksum makes it possible to quickly identify and alert on any bugs in the fragmentation code. Within the QS team we are usually against this kind of defensive measure; we prefer focusing on code quality instead of defending against the consequences of bugs. But database consistency is so critical to the company that even we couldn’t argue against playing defense.
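A per-record checksum of this sort might look like the following sketch, using Go's standard CRC-32; the exact checksum algorithm Quicksilver uses is not stated in the post:

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// checksum covers both key and value. It would be stored when the
// log entry is applied and recomputed on every read; a mismatch
// signals a fragmentation bug or on-disk corruption.
func checksum(key, value []byte) uint32 {
	h := crc32.NewIEEE()
	h.Write(key)
	h.Write(value)
	return h.Sum32()
}

func main() {
	stored := checksum([]byte("hello"), []byte("world"))
	if checksum([]byte("hello"), []byte("world")) != stored {
		fmt.Println("corruption detected")
	} else {
		fmt.Println("ok")
	}
}
```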

LMDB stability has been exceptional. It has been running in production for over three years. We have experienced only a single bug and zero data corruption. Considering we serve over 2.5 trillion read requests and 30 million write requests a day on over 90,000 database instances across thousands of servers, this is very impressive.


Transaction logs are a critical part of our replication system, but each log entry ends up being significantly larger than the size of the values it represents. To prevent our disk space from being overwhelmed we use Snappy to compress entries. We also periodically garbage collect entries, only keeping the most recent required for replication.

For safety purposes we also added an incremental hash within our transaction logs. The hash helps us to ensure that messages have not been lost or incorrectly ordered in the log.

One other potential misconfiguration which scares us is the possibility of a Quicksilver node connecting to, and attempting to replicate from, itself. To prevent this we added a randomly generated process ID which is also exchanged in the handshake:

type ClientHandshake struct {
	// Unique ID of the client. Verifies that client and master have distinct IDs. Defaults to a per-process unique ID.
	ID uuid.ID
	// ID of the database to connect to. Verifies that client and master have the same ID. Disabled by default.
	DBID uuid.ID
	// At which index to start replication. First log entry will have Index = LastIndex + 1.
	LastIndex logentry.Index
	// At which index to stop replication.
	StopIndex logentry.Index
	// Ignore LastIndex and retrieve only new log entries.
	NewestOnly bool
	// Called by Update when a new entry is applied to the db.
	Metrics func(*logentry.LogEntry)
}
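A hypothetical sketch of that guard in Python (Quicksilver itself is written in Go): compare the IDs exchanged in the handshake and refuse to proceed when they match.

```python
import uuid

class SelfReplicationError(Exception):
    pass

def accept_handshake(server_id: uuid.UUID, client_id: uuid.UUID) -> None:
    """Refuse replication when a node tries to replicate from itself.

    Each process generates a random ID at startup; since the IDs are
    random, a match can only mean the client and server are the same
    process.
    """
    if client_id == server_id:
        raise SelfReplicationError("node attempted to replicate from itself")
```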

Each Quicksilver instance has a list of primary servers and secondary servers. It will always try to replicate from a primary node which is often another QS node near it. There are a variety of reasons why this replication may not work, however. For example, if the target machine’s database is too old it will need a larger changeset than exists on the source machine. To handle this, our secondary masters store a significantly longer history to allow machines to be offline for a full week and still be correctly resynchronized on startup.


Building a system is often much easier than maintaining it. One challenge was being able to do weekly releases without stopping the service.

Fortunately the LMDB datastore supports multiple processes reading and writing to the DB file simultaneously. We use systemd to listen for incoming connections and immediately hand the sockets over to our Quicksilver instance. When it’s time to upgrade we start our new instance and seamlessly pass the listening socket over to it.
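Quicksilver is written in Go, but the shape of the systemd socket-activation handoff can be sketched in Python: systemd reports how many sockets it passed via the LISTEN_FDS environment variable, and the inherited descriptors start at fd 3.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # systemd passes inherited sockets starting at fd 3

def inherited_listeners():
    """Rebuild socket objects from file descriptors handed over by systemd.

    systemd sets LISTEN_FDS to the number of sockets it passed; they
    occupy consecutive descriptors starting at 3.
    """
    count = int(os.environ.get("LISTEN_FDS", "0"))
    return [socket.socket(fileno=fd)
            for fd in range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count)]
```

Because the listening socket never closes, clients connecting during an upgrade queue in the kernel backlog rather than seeing a refused connection.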

We also control the clients used to query Quicksilver. By adding an automatic retry to requests we are able to ensure that momentary blips in availability don’t result in user-facing failures.
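A minimal sketch of such a retry wrapper (illustrative only; the real client logic is more involved):

```python
import time

def with_retries(fn, attempts=3, delay=0.01):
    """Call fn, retrying on transient connection errors so momentary
    blips in availability don't surface as user-facing failures."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the failure
            time.sleep(delay)
```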

After years of experience maintaining this system we came to a surprising conclusion: handing off sockets is really neat, but it might involve more complexity than is warranted. A Quicksilver restart happens in single-digit milliseconds, making it more acceptable than we would have thought to let connections momentarily fail without any downstream effects. We are currently evaluating the wisdom of simplifying the update system, as in our experience simplicity is often the best proxy for reliability.


It’s easy to overlook monitoring when designing a new system. We have learned, however, that a system is only as good as our ability to know how well it is working and to debug issues as they arise. Quicksilver is as mission-critical as anything can possibly be at Cloudflare, so it is worth the effort to ensure we can keep it running.

We use Prometheus for collecting our metrics and we use Grafana to monitor Quicksilver. Our SRE team uses a global dashboard, one dashboard for each datacenter, and one dashboard per server, to monitor its performance and availability. Our primary alerting is also driven by Prometheus and distributed using PagerDuty.

We have learned that detecting availability is rather easy: if Quicksilver isn’t available, countless alerts will fire in many systems throughout Cloudflare. Detecting replication lag is trickier, as systems will appear to keep working until it is discovered that changes aren’t taking effect. We monitor our replication lag by writing a heartbeat at the top of the replication tree and computing the time difference on each server.
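The heartbeat scheme can be sketched as follows (a simplified illustration using an in-memory dict in place of the real replicated database):

```python
import time

def write_heartbeat(db):
    """Run at the top of the replication tree: record the current time."""
    db["heartbeat"] = time.time()

def replication_lag(db):
    """Run on each replica: lag is how stale the replicated heartbeat is."""
    return time.time() - db["heartbeat"]
```

Since the heartbeat replicates like any other key, a replica whose lag exceeds a threshold is provably behind, even if every other metric looks healthy.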


Quicksilver is, on one level, an infrastructure tool. Ideally no one, not even most of the engineers who work here at Cloudflare, should have to think twice about it. On another level, the ability to distribute configuration changes in seconds is one of our greatest strengths as a company. It makes using Cloudflare enjoyable and powerful for our users, and it becomes a key advantage for every product we build. This is the beauty and art of infrastructure: building something which is simple enough to make everything built on top of it more powerful, more predictable, and more reliable.

We are planning on open sourcing Quicksilver in the near future and hope it serves you as well as it has served us. If you’re interested in working with us on this and other projects, please take a look at our jobs page.


Using data from spreadsheets in Fedora with Python [Fedora Magazine]

Python is one of the most popular and powerful programming languages available. Because it’s free and open source, it’s available to everyone — and most Fedora systems come with the language already installed. Python is useful for a wide variety of tasks, but among them is processing comma-separated value (CSV) data. CSV files often start off life as tables or spreadsheets. This article shows how to get started working with CSV data in Python 3.

CSV data is precisely what it sounds like. A CSV file includes one row of data at a time, with data values separated by commas. Each row is defined by the same fields. Short CSV files are often easily read and understood. But longer data files, or those with more fields, may be harder to parse with the naked eye, so computers work better in those cases.

Here’s a simple example where the fields are Name, Email, and Country. In this example, the CSV data includes a field definition as the first row, although that is not always the case.

Name,Email,Country
John Q. Smith,,USA
Petr Novak,,CZ
Bernard Jones,,UK

Reading CSV from spreadsheets

Python helpfully includes a csv module that has functions for reading and writing CSV data. Most spreadsheet applications, both native (like Excel or Numbers) and web-based (such as Google Sheets), can export CSV data. In fact, many other services that can publish tabular reports will also export CSV (PayPal, for instance).

The Python csv module has a built-in reader class called DictReader that presents each data row as an ordered dictionary (OrderedDict). It expects a file object to access the CSV data. So if the file above is called example.csv in the current directory, this code snippet is one way to get at the data:

from csv import DictReader

f = open('example.csv', 'r')
d = DictReader(f)
data = []
for row in d:
    data.append(row)

Now the data object in memory is a list of OrderedDict objects:

[OrderedDict([('Name', 'John Q. Smith'),
               ('Email', ''),
               ('Country', 'USA')]),
  OrderedDict([('Name', 'Petr Novak'),
               ('Email', ''),
               ('Country', 'CZ')]),
  OrderedDict([('Name', 'Bernard Jones'),
               ('Email', ''),
               ('Country', 'UK')])]

Referencing each of these objects is easy:

>>> print(data[0]['Country'])
>>> print(data[2]['Email'])

By the way, if you have to deal with a CSV file with no header row of field names, the DictReader class lets you define them. In the example above, add the fieldnames argument and pass a sequence of the names:

d = DictReader(f, fieldnames=['Name', 'Email', 'Country'])
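Putting that together as a tiny self-contained example (using io.StringIO in place of a real headerless file on disk):

```python
import io
from csv import DictReader

# Headerless CSV data; io.StringIO stands in for a real file on disk.
csv_text = "John Q. Smith,,USA\nPetr Novak,,CZ\n"
d = DictReader(io.StringIO(csv_text), fieldnames=['Name', 'Email', 'Country'])
data = list(d)
print(data[0]['Country'])   # prints "USA"
```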

A real world example

I recently wanted to pick a random winner from a long list of individuals. The CSV data I pulled from spreadsheets was a simple list of names and email addresses.

Fortunately, Python also has a helpful random module good for generating random values. The randrange function in the Random class from that module was just what I needed. You can give it a regular range of numbers — like integers — and a step value between them. The function then generates a random result, meaning I could get a random integer (or row number!) back within the total number of rows in my data.

So this small program worked well:

from csv import DictReader
from random import Random

d = DictReader(open('mydata.csv'))
data = []
for row in d:
    data.append(row)

r = Random()
winner = data[r.randrange(0, len(data), 1)]
print('The winner is:', winner['Name'])
print('Email address:', winner['Email'])

Obviously this example is extremely simple. Spreadsheets themselves include sophisticated ways to analyze data. However, if you want to do something outside the realm of your spreadsheet app, Python may be just the trick!

Photo by Isaac Smith on Unsplash.

Sunday, 29 March


Saturday Morning Breakfast Cereal - Wishes [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

Really, the optimal wish is to be perfectly satisfied with your remaining wishes.

Today's News:

Saturday, 28 March


Saturday Morning Breakfast Cereal - Strata [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

I'm suddenly terribly afraid that this is actually a me-specific behavior and I've just outed myself.

Today's News:


Dogfooding from Home: How Cloudflare Built our Cloud VPN Replacement [The Cloudflare Blog]


It’s never been more crucial to help remote workforces stay fully operational — for the sake of countless individuals, businesses, and the economy at large. In light of this, Cloudflare recently launched a program that offers our Cloudflare for Teams suite for free to any company, of any size, through September 1. Some of these firms have been curious about how Cloudflare itself uses these tools.

Here’s how Cloudflare’s next-generation VPN alternative, Cloudflare Access, came to be.

Rewind to 2015. Back then, as with many other companies, all of Cloudflare’s internally-hosted applications were reached via a hardware-based VPN. When one of our on-call engineers received a notification (usually on their phone), they would fire up a clunky client on their laptop, connect to the VPN, and log on to Grafana.

It felt a bit like solving a combination lock with a fire alarm blaring overhead.


But for three of our engineers, enough was enough. Why was a cloud network security company relying on clunky on-premises hardware?

And thus, Cloudflare Access was born.

A Culture of Dogfooding

Many of the products Cloudflare builds are a direct result of the challenges our own team is looking to address, and Access is a perfect example. Development on Access originally began in 2015, when the project was known internally as EdgeAuth.

Initially, just one application was put behind Access. Engineers who received a notification on their phones could tap a link and, after authenticating via their browser, they would immediately have access to the key details of the alert in Grafana. We liked it a lot — enough to get excited about what we were building.

Access solved a variety of issues for our security team as well. Using our identity provider of choice, we were able to restrict access to internal applications at L7 using Access policies. This once onerous process of managing access control at the network layer with a VPN was replaced with a few clicks in the Cloudflare dashboard.


After Grafana, our internal Atlassian suite including Jira and Wiki, and hundreds of other internal applications, the Access team began working to support non-HTTP based services. Support for git allowed Cloudflare’s developers to securely commit code from anywhere in the world in a fully audited fashion. This made Cloudflare’s security team very happy. Here’s a slightly modified example of a real authentication event that was generated while pushing code to our internal git repository.


It didn’t take long for more and more of Cloudflare’s internal applications to make their way behind Access. As soon as people started working with the new authentication flow, they wanted it everywhere. Eventually our security team mandated that we move our apps behind Access, but for a long time it was totally organic: teams were eager to use it.

Incidentally, this highlights a perk of utilizing Access: you can start by protecting and streamlining the authentication flows for your most popular internal tools — but there’s no need for a wholesale rip-and-replace. For organizations that are experiencing limits on their hardware-based VPNs, it can be an immediate salve that is up and running after just one setup call with a Cloudflare onboarding expert (you can schedule a time here).

That said, there are some upsides to securing everything with Access.

Supporting a Global Team

VPNs are notorious for bogging down Internet connections, and the one we were using was no exception. When connecting to internal applications, having all of our employees’ Internet connections pass through a standalone VPN was a serious performance bottleneck and single point of failure.


Cloudflare Access is a much saner approach. Authentication occurs at our network edge, which extends to 200 cities in over 90 countries. Rather than routing all of their network traffic through a single network appliance, employees connecting to internal apps connect to a data center just down the road.

As we support a globally-distributed workforce, our security team is committed to protecting our internal applications with the most secure and usable authentication mechanisms.

With Cloudflare Access we’re able to rely on the strong two-factor authentication mechanisms of our identity provider, which was much more difficult to do with our legacy VPN.

On-Boarding and Off-Boarding with Confidence

One of the trickiest things for any company is ensuring everyone has access to the tools and data they need — but no more than that. That’s a challenge that becomes all the more difficult as a team scales. As employees and contractors leave, it is similarly essential to ensure that their permissions are swiftly revoked.

Managing these access controls is a real challenge for IT organizations around the world — and it’s greatly exacerbated when each employee has multiple accounts strewn across different tools in different environments. Before using Access, our team had to put in a lot of time to make sure every box was checked.

Now that Cloudflare’s internal applications are secured with Access, on- and offboarding is much smoother. Each new employee and contractor is quickly granted rights to the applications they need, and they can reach them via a launchpad that makes them readily accessible. When someone leaves the team, one configuration change gets applied to every application, so there isn’t any guesswork.

Access is also a big win for network visibility. With a VPN, you get minimal insight into the activity of users on the network – you know their username and IP address, but that’s about it. If someone manages to get in, it’s difficult to retrace their steps.

Cloudflare Access is based on a zero-trust model, which means that every packet is authenticated. It allows us to assign granular permissions via Access Groups to employees and contractors. And it gives our security team the ability to detect unusual activity across any of our applications, with extensive logging to support analysis. Put simply: it makes us more confident in the security of our internal applications.

But It’s Not Just for Us

With the massive transition to a remote work model for many organizations, Cloudflare Access can make you more confident in the security of your internal applications — while also driving increased productivity in your remote employees. Whether you rely on Jira, Confluence, SAP or custom-built applications, it can secure those applications and it can be live in minutes.

Cloudflare has made the decision to make Access completely free to all organizations, all around the world, through September 1. If you’d like to get started, follow our quick start guide here:
Or, if you’d prefer to onboard with one of our specialists, schedule a 30 minute call at this link:

Migrating from VPN to Access [The Cloudflare Blog]


With so many people at Cloudflare now working remotely, it's worth stepping back and looking at the systems we use to get work done and how we protect them. Over the years we've migrated from a traditional "put it behind the VPN!" company to a modern zero-trust architecture. Cloudflare hasn’t completed its journey yet, but we're pretty darn close. Our general strategy: protect every internal app we can with Access (our zero-trust access proxy), and simultaneously beef up our VPN’s security with Spectrum (a product allowing the proxying of arbitrary TCP and UDP traffic, protecting it from DDoS).

Before Access, we had many services behind VPN (Cisco ASA running AnyConnect) to enforce strict authentication and authorization. But VPN always felt clunky: it's difficult to set up, maintain (securely), and scale on the server side. Each new employee we onboarded needed to learn how to configure their client. But migration takes time and involves many different teams. While we migrated services one by one, we focused on the high priority services first and worked our way down. Until the last service is moved to Access, we still maintain our VPN, keeping it protected with Spectrum.

Some of our services didn't run over HTTP or other Access-supported protocols, and still required the use of the VPN: source control (git+ssh) was a particular sore spot. If any of our developers needed to commit code they'd have to fire up the VPN to do so. To help in our new-found goal to kill the pinata, we introduced support for SSH over Access, which allowed us to replace the VPN as a protection layer for our source control systems.

Over the years, we've been whittling away at our services, one by one. We're nearly there, with only a few niche tools remaining behind the VPN and not behind Access. As of this year, we are no longer requiring new employees to set up VPN as part of their company onboarding! We can see this in our Access logs, with more users logging into more apps every month.

During this transition period from VPN to Access, we've had to keep our VPN service up and running. As VPN is a key tool for people doing their work while remote, it's extremely important that this service is highly available and performant.

Enter Spectrum: our DDoS protection and performance product for any TCP and UDP-based protocol. We put Spectrum in front of our VPN very early on and saw immediate improvement in our security posture and availability, all without any changes in end-user experience.

With Spectrum sitting in front of our VPN, we now use the entire Cloudflare edge network to protect our VPN endpoints against DDoS and improve performance for VPN end-users.

Setup was a breeze, with only minimal configuration needed.

Cisco AnyConnect uses HTTPS (TCP) to authenticate, after which the actual data is tunneled using a DTLS encrypted UDP protocol.

Although configuration and setup was a breeze, actually getting it to work was definitely not. Our early users quickly noted that although authenticating worked just fine, they couldn’t actually see any data flowing through the VPN. We quickly realized our arch nemesis, the MTU (maximum transmission unit), was to blame. As some of our readers might remember, we have historically always set a very small MTU size for IPv6. We did this because there might be IPv6 to IPv4 tunnels between eyeballs and our edge. By setting it very low we prevented PTB (packet too big) packets from ever getting sent back to us, which causes problems due to our ECMP routing inside our data centers. But with a VPN, you always increase the packet size due to the VPN header. This means that the 1280 MTU that we had set would never be enough to run a UDP-based VPN. We ultimately settled on an MTU of 1420, which we still run today and which allows us to protect our VPN entirely using Spectrum.
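As back-of-the-envelope arithmetic (IPv6 and UDP header sizes are fixed at 40 and 8 bytes; the DTLS per-record overhead shown is an assumed illustrative number, since the real value varies with the negotiated cipher):

```python
# Rough arithmetic for why tunnel headers shrink the usable inner MTU.
IPV6_HEADER = 40    # bytes, fixed by the IPv6 spec
UDP_HEADER = 8      # bytes, fixed by the UDP spec
DTLS_OVERHEAD = 25  # bytes, assumed value for illustration only

def inner_mtu(outer_mtu):
    """Largest tunneled packet that fits inside one outer packet."""
    return outer_mtu - IPV6_HEADER - UDP_HEADER - DTLS_OVERHEAD

print(inner_mtu(1280))  # well below 1280: full-size inner packets won't fit
print(inner_mtu(1420))  # leaves room for a typical 1280-byte inner packet
```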

Over the past few years this has served us well, knowing that our VPN infrastructure is safe and people will be able to continue to work remotely no matter what happens. All in all this has been a very interesting journey, whittling down one service at a time, getting closer and closer to the day we can officially retire our VPN. To us, Access represents the future, with Spectrum + VPN to tide us over and protect our services until they’ve migrated over. In the meantime, as of the start of 2020, new employees no longer get a VPN account by default!

Friday, 27 March


Using Cloudflare to secure your cardholder data environment [The Cloudflare Blog]


As part of our ongoing compliance efforts Cloudflare’s PCI scope is periodically reviewed (including after any significant changes) to ensure all in-scope systems are operating in accordance with the PCI DSS. This review also allows us to periodically review each product we offer as a PCI validated service provider and identify where there might be opportunities to provide greater value to our customers.

Building trust in our products is one critical component that allows Cloudflare’s mission of “Helping to build a better Internet” to succeed. We reaffirm our dedication to building trust in our products by obtaining industry standard security compliance certifications and complying with regulations.

Cloudflare is a Level 1 Merchant, the highest level, and also provides services to organizations to help secure their cardholder data environment. Maintaining PCI DSS compliance is important for Cloudflare because (1) we must ensure that our transmission and processing of cardholder data is secure for our own customers, (2) that our customers know they can trust Cloudflare’s products to transmit cardholder data securely, and (3) that anyone who interacts with Cloudflare’s services know that their information is transmitted securely.

The PCI standard applies to any company or organization that accepts credit cards, debit cards, or even prepaid cards for payment. The purpose of this compliance standard is to help protect financial institutions and customers from having their payment card information compromised. Each major payment card brand has merchants sorted into different tiers based on the number of transactions made per year, and each tier requires varying requirements to satisfy their compliance obligations. Annually, Cloudflare undergoes an assessment by a Qualified Security Assessor. This assessor conducts a thorough review of Cloudflare’s technical environment and validates that Cloudflare’s controls related to securing the transmission, processing, and storage of cardholder data meet the requirements in the PCI Data Security Standard (PCI DSS).

Cloudflare has been PCI compliant since 2014 as both a merchant and as a service provider, but this year we have expanded our Service Provider scope to include more products that will help our customers become more secure and meet their own compliance obligations.

How can Cloudflare Help You?

In addition to our WAF, we are proud to announce that Cloudflare’s Content Delivery Network, Cloudflare Access, and the Cloudflare Time Service are also certified under our latest Attestation of Compliance!

Our Attestation of Compliance is applicable for all Pro, Business, and Enterprise accounts. This designation can be used to simplify your PCI audits and remove the pressure on you to manage these services or appliances locally.

If you use our WAF, enable the OWASP ruleset, and tune rules for your environment, you will meet the need to protect web-facing applications and satisfy PCI requirement 6.6.

As detailed by several recent blog posts, Cloudflare Access is changing the game and your relationship with your corporate VPN. Many organizations rely on VPNs and other segmentation tools to reduce the scope of their cardholder data environment. Cloudflare Access provides another means of segmentation by using Cloudflare’s global network as a VPN service to access internal resources. Additionally, these sessions can be configured to time out after 15 minutes of inactivity to help customers meet requirement 8.1.8!

There are several large providers of time services that most organizations use. However, in 2019 Cloudflare announced our NTP service. Our time service benefits from our CDN and global network, which provide an advantage in latency and accuracy. Our 200 locations around the world all use anycast to route your packets to our closest server. All of our servers are synchronized with stratum 1 time service providers, and then offer NTP to the general public, similar to how other public NTP providers function. Accurate time services are critical to maintaining accurate audit logging and being able to respond to incidents. By changing your time source to Cloudflare’s NTP service, we can help you meet requirement 10.4.3.

Finally, Cloudflare has given our customers the opportunity to configure higher levels of TLS. Currently, you can enable up to TLS 1.3 within your Cloudflare dashboard, which exceeds the minimum of TLS 1.1 or higher referenced in requirement 4.1!

We use our own products to secure our cardholder data environment and hope that our customers will find these product additions as beneficial and easy to implement as we have.

Learn more about Compliance at Cloudflare

Cloudflare is committed to helping our customers earn their users’ trust by ensuring our products are secure. The Security team is committed to adhering to security compliance certifications and regulations that maintain the security, confidentiality, and availability of company and client information.

In order to help our customers keep track of the latest certifications, Cloudflare continually updates our compliance certification page. Today, you can view our status on all compliance certifications and download our SOC 3 report.

Thursday, 26 March


Saturday Morning Breakfast Cereal - Quarantine [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

I swear, Onanymous is (1) not a typo, and (2) comedy gold.

Today's News:


Migrating to React land: Gatsby [The Cloudflare Blog]


I am an engineer that loves docs. Well, OK, I don’t love all docs but I believe docs are a crucial, yet often neglected element to a great developer experience. I work on the developer experience team for Cloudflare Workers focusing on several components of Workers, particularly on the docs that we recently migrated to Gatsby.

Through porting our documentation site to Gatsby I learned a lot. In this post, I share some of the learnings that could’ve saved my former self from several headaches. This will hopefully help others considering a move to Gatsby or another static site generator.

Why Gatsby?

Prior to our migration to Gatsby, we used Hugo for our developer documentation. There are a lot of positives about working with Hugo - fast build times, fast load times - that made building a simple static site a great use case for Hugo. Things started to turn sour when we started making our docs more interactive and expanding the content being generated.

Going from writing JSX with TypeScript back to string-based templating languages is difficult. Trying to perform complicated tasks, like generating a sidebar, cost me - a developer who knows nothing about Liquid or Go templating (though I have Go experience) - several tears, not even to implement anything but just to understand what was happening.

Here is the code to template an item in the sidebar in Hugo:

<!-- templates -->
{{ define "section-tree-nav" }}
{{ $currentNode := .currentnode }}
{{ with .sect }}
 {{ if not .Params.Hidden }}
  {{ if .IsSection }}
    {{safeHTML .Params.head}}
    <li data-nav-id="{{.URL}}" class="dd-item
        {{ if .IsAncestor $currentNode }}parent{{ end }}
        {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
        {{ if .Params.alwaysopen}}parent{{ end }}
        {{ if .Params.alwaysopen}}always-open{{ end }}">
      <a href="{{ .RelPermalink}}">
        <span>{{safeHTML .Params.Pre}}{{.Title}}{{safeHTML .Params.Post}}</span>
        {{ if }}
          <span class="new-badge">NEW</span>
        {{ end }}
        {{ $numberOfPages := (add (len .Pages) (len .Sections)) }}
        {{ if ne $numberOfPages 0 }}
          {{ if or (.IsAncestor $currentNode) (.Params.alwaysopen)  }}
            <i class="triangle-up"></i>
          {{ else }}
            <i class="triangle-down"></i>
          {{ end }}
        {{ end }}
      {{ if ne $numberOfPages 0 }}
          {{ .Scratch.Set "pages" .Pages }}
          {{ if .Sections}}
          {{ .Scratch.Set "pages" (.Pages | union .Sections) }}
          {{ end }}
          {{ $pages := (.Scratch.Get "pages") }}
        {{ if eq .Site.Params.ordersectionsby "title" }}
          {{ range $pages.ByTitle }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ else }}
          {{ range $pages.ByWeight }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ end }}
      {{ end }}
  {{ else }}
    {{ if not .Params.Hidden }}
      <li data-nav-id="{{.URL}}" class="dd-item
     {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}">
        <a href="{{.RelPermalink}}">
        <span>{{safeHTML .Params.Pre}}{{.Title}}{{safeHTML .Params.Post}}</span>
        {{ if }}
          <span class="new-badge">NEW</span>
        {{ end }}
     {{ end }}
  {{ end }}
 {{ end }}
{{ end }}
{{ end }}

Whoa. I may be exceptionally oblivious, but I had to squint at the snippet above for an hour before I realized this was the code for a sidebar item (the li element was the eventual giveaway, but took some parsing to discover where the logic actually started).

(Disclaimer: I am in no way a pro at Hugo and in any situation there are always several ways to code a solution; thus I am in no way claiming this was the only way to write the template nor am I chastising the author of the code. I am just displaying the differences in pieces of code I came across)

Now, here is what the TSX (I will get into the JS later in the article) for the Gatsby project using the exact same styling would look like:

 <li data-nav-id={pathToServe} className={'dd-item ' + ddClass}>
   <Link className="" to={pathToServe} title="Docs Home" activeClassName="active">
     {title || 'No title'}
     {numberOfPages ? <Triangle isAncestor={isAncestor} alwaysopen={showChildren} /> : ''}
     {showNew ? <span className="new-badge">NEW</span> : ''}
   {showChildren ? (
       {' '}
       { mdx) => {
         return (
   ) : (

This code is clean and compact because Gatsby is a static content generation tool based on React. It’s loved for a myriad of reasons, but my honest main reason to migrate to it was to make the Hugo code above much less ugly.

For our purposes, less ugly was important because we had dreams of redesigning our docs to be interactive with support for multiple coding languages and other features.

For example, the template gallery would be a place to go to for how-to recipes and examples. The templates themselves would live in a template registry service and turn into static pages via an API.

We wanted the docs to not be constrained by Go templating. The Hugo docs admit their templates aren’t the best for complicated logic:

Go Templates provide an extremely simple template language that adheres to the belief that only the most basic of logic belongs in the template or view layer.

Gatsby and React enable the more complex logic we were looking for. After our team built and Built with Workers on Gatsby, I figured this was my shot to really give Gatsby a try on our Workers developer docs.

Decision to Migrate over Starting from Scratch

I’m normally not a fan of fixing things that aren’t broken. Though I didn’t like working with Hugo, did love working in React, and had every reason to switch, I was timid about being the one in charge of the migration. I was scared. I hated looking at the liquid code of Go templates. I didn’t want to have to port all the existing templates to React without truly understanding what I might be missing.

There comes a point, though, where you have to tackle the tech debt you are most scared of.

The easiest solution would be of course to throw the Hugo code away. Start from scratch. A clean slate. But this means taking something that was not broken and breaking it. The styling, SEO, tagging, and analytics of the site took small iterations over the course of a few years to get right and I didn’t want to be the one to break them. Instead of throwing all the styling and logic tied in for search, SEO, etc..., our plan was to maintain as much of the current design and logic as possible while converting it to React piece-by-piece, component-by-component.

Also, other teams at Cloudflare still had developer docs using Hugo (e.g. Access, Argo Tunnel, etc...). I wanted a team at Cloudflare to be able to import their existing markdown files with frontmatter into the Gatsby repo and preserve the existing design.

I wanted to migrate instead of teleport to Gatsby.

How-to: Hugo to Gatsby

In this blog post, I go through some but not all of the steps of how I ported to Gatsby from Hugo for our complex doc site. The few examples here help to convey the issues that caused the most pain.

Let’s start with getting the markdown files to turn into HTML pages.


One goal was to keep all the existing markdown and frontmatter we had set up in Hugo as similar as possible. The reasoning for this was to not break existing content and also maintain the version history of each doc.
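To make that concrete, a typical doc page was a markdown file with a small frontmatter block. A hypothetical example (the field names title, weight, alwaysopen and hidden are ones our templates actually read):

```markdown
---
title: Write Workers Scripts
weight: 20
alwaysopen: false
hidden: false
---

## Overview

Page content continues in plain markdown…
```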

Gatsby is built on top of GraphQL. All the data and almost all content for Gatsby is put into GraphQL during startup, usually via a plugin; Gatsby then queries for this data upon actual page creation. This is quite different from Hugo’s much more abstract model of putting all your content in a folder named content and letting Hugo figure out which template to apply based on the logic in the template.

MDX is a sophisticated tool that parses markdown into Gatsby so it can later be represented as HTML (it can actually do much more than that, but I won’t get into it here). I started with Gatsby’s MDX plugin to create nodes from my markdown files. Here is the code to set up the plugin to get all the markdown files (files ending in .md and .mdx) I had in the src/content folder into GraphQL:


const path = require('path')

module.exports = {
  plugins: [
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `mdx-pages`,
        path: `${__dirname}/src/content`,
        ignore: [`**/CONTRIBUTING*`, '/styles/**'],
      },
    },
    {
      resolve: `gatsby-plugin-mdx`,
      options: {
        extensions: [`.mdx`, `.md`],
      },
    },
  ],
}

Now that Gatsby knows about these files as nodes, we can create pages for them. In gatsby-node.js, I tell Gatsby to grab these MDX pages and use a template markdownTemplate.tsx to create pages for them:

const path = require(`path`)
const { createFilePath } = require(`gatsby-source-filesystem`)

exports.createPages = async ({ actions, graphql, reporter }) => {
  const { createPage } = actions
  const markdownTemplate = path.resolve(`src/templates/markdownTemplate.tsx`)
  const result = await graphql(`
    {
      allMdx(limit: 1000) {
        edges {
          node {
            fields {
              pathToServe
              parent
            }
            frontmatter {
              weight
            }
          }
        }
      }
    }
  `)
  // Handle errors
  if (result.errors) {
    reporter.panicOnBuild(`Error while running GraphQL query.`)
  }
  result.data.allMdx.edges.forEach(({ node }) => {
    return createPage({
      path: node.fields.pathToServe,
      component: markdownTemplate,
      context: {
        parent: node.fields.parent,
        weight: node.frontmatter.weight,
      }, // additional data can be passed via context, can use as variable on query
    })
  })
}

exports.onCreateNode = ({ node, getNode, actions }) => {
  const { createNodeField } = actions
  // Ensures we are processing only markdown files
  if (node.internal.type === 'Mdx') {
    // Use `createFilePath` to turn markdown files in our `content` directory into `/workers/`pathToServe
    const originalPath = node.fileAbsolutePath.replace(/^.*\/src\/content/, '')
    let pathToServe = createFilePath({
      node,
      getNode,
      basePath: 'content/',
    })
    let parentDir = path.dirname(pathToServe)
    if (pathToServe.includes('index')) {
      pathToServe = parentDir
      parentDir = path.dirname(parentDir) // "/" dirname will = "/"
    }
    pathToServe = pathToServe.replace(/\/+$/, '/') // always end the path with a slash
    // Creates new query'able fields 'pathToServe', 'parent' and 'filePath'
    // for allMdx edge nodes
    createNodeField({ node, name: 'pathToServe', value: `/workers${pathToServe}` })
    createNodeField({ node, name: 'parent', value: parentDir })
    createNodeField({ node, name: 'filePath', value: originalPath })
  }
}

Now every time Gatsby runs, it starts running through each node on onCreateNode. If the node is MDX, it passes the node’s content (the markdown, fileAbsolutePath, etc.) and all the node fields (filePath, parent and pathToServe) to the markdownTemplate.tsx component so that the component can render the appropriate information for that markdown file.

The barebones component for a page that renders a React component from the MDX node looks like this:


import React from "react"
import { graphql } from "gatsby"
import { MDXRenderer } from "gatsby-plugin-mdx"

export default function PageTemplate({ data: { mdx } }) {
  return <MDXRenderer>{mdx.body}</MDXRenderer>
}

export const pageQuery = graphql`
  query BlogPostQuery($id: String) {
    mdx(id: { eq: $id }) {
      body
      frontmatter {
        title
      }
    }
  }
`

A Complex Component: Sidebar

Now let’s get into where I wasted the most time, but learned hard lessons upfront: turning the Hugo template into a React component. At the beginning of this article, I showed that scary sidebar.

To set up the li element, the Hugo logic we had looks like this:

{{ define "section-tree-nav" }}
{{ $currentNode := .currentnode }}
{{ with .sect }}
 {{ if not .Params.Hidden }}
  {{ if .IsSection }}
    {{safeHTML .Params.head}}
    <li data-nav-id="{{.URL}}" class="dd-item
        {{ if .IsAncestor $currentNode }}parent{{ end }}
        {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
        {{ if .Params.alwaysopen}}parent{{ end }}
        {{ if .Params.alwaysopen}}always-open{{ end }}
        ">

I see that the code is defining some section-tree-nav component-like thing and taking in some currentNode. To be honest, I still don’t know exactly what the variables .sect, IsSection, Params.head, Params.Hidden mean. Although I can take a wild guess, they're not that important for understanding what the logic is doing. The logic is setting the classes on the li element which is all I really care about: parent, always-open and active.

When focusing on those three classes, we can port them to React in a much more readable way by defining a variable string ddClass:

 let ddClass = ''
 const isAncestor = numberOfPages > 0
 if (isAncestor) {
   ddClass += ' parent'
 }
 if (frontmatter.alwaysopen) {
   ddClass += ' parent alwaysOpen'
 }
 return (
   <Location>
     {({ location }) => {
       const currentPathActive = location.pathname === pathToServe
       if (currentPathActive) {
         ddClass += ' active'
       }
       return (
         <li data-nav-id={pathToServe} className={'dd-item ' + ddClass}>
           {/* ...children rendered here... */}
         </li>
       )
     }}
   </Location>
 )

There are actually a few nice things about the Hugo code, I admit. Using the Location component in React was probably less intuitive than Hugo’s ability to access currentNode to get the active page. Also, isAncestor is predefined in Hugo as "whether the current page is an ancestor of the given page". For me though, having to track down the definitions of these predefined variables was frustrating; I appreciate the local explicitness of an inline definition, but I admit I’m a bit jaded.


The most complex part of the sidebar is getting the children. Now this is a story that really gets me starting to appreciate GraphQL.

Here’s getting the children for the sidebar in Hugo:

    {{ $numberOfPages := (add (len .Pages) (len .Sections)) }}
        {{ if ne $numberOfPages 0 }}
          {{ if or (.IsAncestor $currentNode) (.Params.alwaysopen)  }}
            <i class="triangle-up"></i>
          {{ else }}
            <i class="triangle-down"></i>
          {{ end }}
        {{ end }}
      {{ if ne $numberOfPages 0 }}
          {{ .Scratch.Set "pages" .Pages }}
          {{ if .Sections}}
          {{ .Scratch.Set "pages" (.Pages | union .Sections) }}
          {{ end }}
          {{ $pages := (.Scratch.Get "pages") }}
        {{ if eq .Site.Params.ordersectionsby "title" }}
          {{ range $pages.ByTitle }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ else }}
          {{ range $pages.ByWeight }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ end }}
      {{ end }}
  {{ else }}
    {{ if not .Params.Hidden }}
      <li data-nav-id="{{.URL}}" class="dd-item
     {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
        <a href="{{.RelPermalink}}">
        <span>{{safeHTML .Params.Pre}}{{.Title}}{{safeHTML .Params.Post}}</span>
        {{ if }}
          <span class="new-badge">NEW</span>
        {{ end }}
     {{ end }}
  {{ end }}
 {{ end }}
{{ end }}
{{ end }}

This is just the first layer of children. No grandbabies, sorry. And I won’t even get into all that is going on there exactly. When I started porting this over, I realized a lot of that logic was not even being used.

In React, we grab all the markdown pages and see which have parents that match the current page:

 const topLevelMarkdown: markdownRemarkEdge[] = useStaticQuery(graphql`
   {
     allMdx(limit: 1000) {
       edges {
         node {
           frontmatter {
             hidden
           }
           fields {
             pathToServe
             parent
           }
         }
       }
     }
   }
 `).allMdx.edges
 const myChildren: mdx[] = topLevelMarkdown
   .filter(
     edge =>
       fields.pathToServe === '/workers' + edge.node.fields.parent &&
       fields.pathToServe !== edge.node.fields.pathToServe
   )
   .map(child => child.node)
   .filter(child => !child.frontmatter.hidden)
 const numberOfPages = myChildren.length
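Since the query just yields plain objects, the matching itself is ordinary array filtering. Here is a standalone sketch with made-up page data (childrenOf is a hypothetical helper, not the component itself):

```javascript
// Standalone sketch of the sidebar's child lookup with made-up data:
// a page's children are the pages whose `parent` field points at it.
function childrenOf(currentPath, pages) {
  return pages
    .filter(p => '/workers' + p.parent === currentPath && p.pathToServe !== currentPath)
    .filter(p => !p.hidden)
}

const pages = [
  { pathToServe: '/workers/tooling', parent: '/', hidden: false },
  { pathToServe: '/workers/tooling/wrangler', parent: '/tooling', hidden: false },
  { pathToServe: '/workers/tooling/secret', parent: '/tooling', hidden: true },
]

console.log(childrenOf('/workers/tooling', pages).map(p => p.pathToServe))
// [ '/workers/tooling/wrangler' ]
```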

And then we render the children, so the full JSX becomes:

<li data-nav-id={pathToServe} className={'dd-item ' + ddClass}>
  <Link to={pathToServe} title="Docs Home" activeClassName="active">
    {title || 'No title'}
    {numberOfPages ? (
      <Triangle isAncestor={isAncestor} alwaysopen={showChildren} />
    ) : (
      ''
    )}
    {showNew ? <span className="new-badge">NEW</span> : ''}
  </Link>
  {showChildren ? (
    <ul>
      {' '}
      {myChildren.map((child: mdx) => {
        // render this same sidebar component for each child page
        return <SidebarLi key={child.fields.pathToServe} {...child} />
      })}
    </ul>
  ) : (
    ''
  )}
</li>

Ok now that we have a component, and we have Gatsby creating the pages off the markdown, I can go back to my PageTemplate component and render the sidebar:

import Sidebar from './Sidebar'
export default function PageTemplate({ data: { mdx } }) {
  return (
    <div>
      <Sidebar />
      <MDXRenderer>{mdx.body}</MDXRenderer>
    </div>
  )
}

I don’t have to pass any props to Sidebar because the GraphQL static query in Sidebar.tsx gets all the data about all the pages that I need. I don’t even maintain state because Location is used to determine which path is active. Gatsby generates pages using the above component for each page that’s a markdown MDX node.

Wrapping up

This was just the beginning of the full migration to Gatsby. I repeated the process above for turning templates, partials, and other HTML component-like parts in Hugo into React, which was actually pretty fun. Turning vanilla JS that once manipulated the DOM into React, though, would probably have been a nightmare if I weren’t somewhat comfortable working in React.

Main lessons learned:

  • Being careful about breaking things and being scared to break things are two very different things. Being careful is good; being scared is bad. If I were to complete this migration again, I would’ve used the Hugo templates as a reference but not as a source of truth. Staging environments are what testing is for. Don’t sacrifice writing things the right way to comply with the old way.
  • When doing a migration like this on a static site, get just a few pages working before moving the content over to avoid intermediate PRs from breaking. It seems obvious, but with the large amounts of content we had, a lot of things broke when we ported it over. Get everything polished with each type of page before moving all your content over.
  • When doing a migration like this, it’s OK to compromise some features of the old design until you determine whether to add them back in; just make sure to test the compromise with real users first. For example, I made the mistake of assuming others wouldn’t mind being without anchor tags. (Note: Hugo templates create anchor tags for headers automatically, whereas in Gatsby you have to use MDX to customize markdown components.) Test such a change on a single, popular page with real users to see if it matters before giving the feature up.
  • Even for those with React background, the ramp up with GraphQL and setting up Gatsby isn’t as simple as it seems at first. But once you’re set up it’s pretty dang nice.

Overall the process of moving to Gatsby was well worth the effort. As we implement a redesign in React, it’s much easier to apply the designs in this cleaner code base. And though Hugo was already very performant with a nice SEO score, Gatsby’s flexibility lets us push performance and SEO even further.

Lastly, working with the Gatsby team was awesome and they even give free T-shirts for your first PR!

Wednesday, 25 March


Saturday Morning Breakfast Cereal - Like [Saturday Morning Breakfast Cereal]



Speeding up Linux disk encryption [The Cloudflare Blog]

Speeding up Linux disk encryption

Data encryption at rest is a must-have for any modern Internet company. Many companies, however, don't encrypt their disks, because they fear the potential performance penalty caused by encryption overhead.

Encrypting data at rest is vital for Cloudflare with more than 200 data centres across the world. In this post, we will investigate the performance of disk encryption on Linux and explain how we made it at least two times faster for ourselves and our customers!

Encrypting data at rest

When it comes to encrypting data at rest there are several ways it can be implemented on a modern operating system (OS). Available techniques are tightly coupled with a typical OS storage stack. A simplified version of the storage stack and encryption solutions can be found on the diagram below:

Speeding up Linux disk encryption

On the top of the stack are applications, which read and write data in files (or streams). The file system in the OS kernel keeps track of which blocks of the underlying block device belong to which files and translates these file reads and writes into block reads and writes; the hardware specifics of the underlying storage device, however, are abstracted away from the filesystem. Finally, the block subsystem actually passes the block reads and writes to the underlying hardware using appropriate device drivers.

The concept of the storage stack is actually similar to the well-known network OSI model, where each layer has a more high-level view of the information and the implementation details of the lower layers are abstracted away from the upper layers. And, similar to the OSI model, one can apply encryption at different layers (think about TLS vs IPsec or a VPN).

For data at rest we can apply encryption either at the block layers (either in hardware or in software) or at the file level (either directly in applications or in the filesystem).

Block vs file encryption

Generally, the higher in the stack we apply encryption, the more flexibility we have. With application level encryption the application maintainers can apply any encryption code they please to any particular data they need. The downside of this approach is they actually have to implement it themselves and encryption in general is not very developer-friendly: one has to know the ins and outs of a specific cryptographic algorithm, properly generate keys, nonces, IVs etc. Additionally, application level encryption does not leverage OS-level caching and Linux page cache in particular: each time the application needs to use the data, it has to either decrypt it again, wasting CPU cycles, or implement its own decrypted “cache”, which introduces more complexity to the code.

File system level encryption makes data encryption transparent to applications, because the file system itself encrypts the data before passing it to the block subsystem, so files are encrypted regardless if the application has crypto support or not. Also, file systems can be configured to encrypt only a particular directory or have different keys for different files. This flexibility, however, comes at a cost of a more complex configuration. File system encryption is also considered less secure than block device encryption as only the contents of the files are encrypted. Files also have associated metadata, like file size, the number of files, the directory tree layout etc., which are still visible to a potential adversary.

Encryption down at the block layer (often referred to as disk encryption or full disk encryption) also makes data encryption transparent to applications and even whole file systems. Unlike file system level encryption it encrypts all data on the disk including file metadata and even free space. It is less flexible though: one can only encrypt the whole disk with a single key, so there is no per-directory, per-file or per-user configuration. From the crypto perspective, not all cryptographic algorithms can be used, as the block layer no longer has a high-level overview of the data, so it needs to process each block independently. Most common algorithms require some sort of block chaining to be secure, so they are not applicable to disk encryption. Instead, special modes were developed just for this specific use-case.

So which layer to choose? As always, it depends... Application and file system level encryption are usually the preferred choice for client systems because of the flexibility. For example, each user on a multi-user desktop may want to encrypt their home directory with a key they own and leave some shared directories unencrypted. By contrast, on server systems managed by SaaS/PaaS/IaaS companies (including Cloudflare) the preferred choice is configuration simplicity and security: with full disk encryption enabled, any data from any application is automatically encrypted with no exceptions or overrides. We believe that all data needs to be protected without sorting it into "important" vs "not important" buckets, so the selective flexibility the upper layers provide is not needed.

Hardware vs software disk encryption

When encrypting data at the block layer it is possible to do it directly in the storage hardware, if the hardware supports it. Doing so usually gives better read/write performance and consumes less resources from the host. However, since most hardware firmware is proprietary, it does not receive as much attention and review from the security community. In the past this led to flaws in some implementations of hardware disk encryption, which rendered the whole security model useless. Microsoft, for example, has since started to prefer software-based disk encryption.

We didn't want to put our data and our customers' data to the risk of using potentially insecure solutions and we strongly believe in open-source. That's why we rely only on software disk encryption in the Linux kernel, which is open and has been audited by many security professionals across the world.

Linux disk encryption performance

We aim not only to save bandwidth costs for our customers, but to deliver content to Internet users as fast as possible.

At one point we noticed that our disks were not as fast as we would like them to be. Some profiling as well as a quick A/B test pointed to Linux disk encryption. Because not encrypting the data (even if it is a supposed-to-be-public Internet cache) is not a sustainable option, we decided to take a closer look into Linux disk encryption performance.

Device mapper and dm-crypt

Linux implements transparent disk encryption via a dm-crypt module, and dm-crypt itself is part of the device mapper kernel framework. In a nutshell, the device mapper allows pre/post-processing of IO requests as they travel between the file system and the underlying block device.

dm-crypt in particular encrypts "write" IO requests before sending them further down the stack to the actual block device and decrypts "read" IO requests before sending them up to the file system driver. Simple and easy! Or is it?

Benchmarking setup

For the record, the numbers in this post were obtained by running specified commands on an idle Cloudflare G9 server out of production. However, the setup should be easily reproducible on any modern x86 laptop.

Generally, benchmarking anything around a storage stack is hard because of the noise introduced by the storage hardware itself. Not all disks are created equal, so for the purpose of this post we will use the fastest disks available out there - that is no disks.

Instead Linux has an option to emulate a disk directly in RAM. Since RAM is much faster than any persistent storage, it should introduce little bias in our results.

The following command creates a 4GB ramdisk:

$ sudo modprobe brd rd_nr=1 rd_size=4194304
$ ls /dev/ram0

Now we can set up a dm-crypt instance on top of it thus enabling encryption for the disk. First, we need to generate the disk encryption key, "format" the disk and specify a password to unlock the newly generated key.

$ fallocate -l 2M crypthdr.img
$ sudo cryptsetup luksFormat /dev/ram0 --header crypthdr.img

This will overwrite data on crypthdr.img irrevocably.

Are you sure? (Type uppercase yes): YES
Enter passphrase:
Verify passphrase:

Those who are familiar with LUKS/dm-crypt might have noticed we used a LUKS detached header here. Normally, LUKS stores the password-encrypted disk encryption key on the same disk as the data, but since we want to compare read/write performance between encrypted and unencrypted devices, we might accidentally overwrite the encrypted key during our benchmarking later. Keeping the encrypted key in a separate file avoids this problem for the purposes of this post.

Now, we can actually "unlock" the encrypted device for our testing:

$ sudo cryptsetup open --header crypthdr.img /dev/ram0 encrypted-ram0
Enter passphrase for /dev/ram0:
$ ls /dev/mapper/encrypted-ram0

At this point we can now compare the performance of encrypted vs unencrypted ramdisk: if we read/write data to /dev/ram0, it will be stored in plaintext. Likewise, if we read/write data to /dev/mapper/encrypted-ram0, it will be decrypted/encrypted on the way by dm-crypt and stored in ciphertext.

It's worth noting that we're not creating any file system on top of our block devices to avoid biasing results with a file system overhead.

Measuring throughput

When it comes to storage testing/benchmarking, Flexible I/O tester (fio) is the usual go-to solution. Let's simulate a simple sequential read/write load with 4K block size on the ramdisk without encryption:

$ sudo fio --filename=/dev/ram0 --readwrite=readwrite --bs=4k --direct=1 --loops=1000000 --name=plain
plain: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
Starting 1 process
Run status group 0 (all jobs):
   READ: io=21013MB, aggrb=1126.5MB/s, minb=1126.5MB/s, maxb=1126.5MB/s, mint=18655msec, maxt=18655msec
  WRITE: io=21023MB, aggrb=1126.1MB/s, minb=1126.1MB/s, maxb=1126.1MB/s, mint=18655msec, maxt=18655msec

Disk stats (read/write):
  ram0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

The above command will run for a long time, so we just stop it after a while. As we can see from the stats, we're able to read and write roughly with the same throughput around 1126 MB/s. Let's repeat the test with the encrypted ramdisk:

$ sudo fio --filename=/dev/mapper/encrypted-ram0 --readwrite=readwrite --bs=4k --direct=1 --loops=1000000 --name=crypt
crypt: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
Starting 1 process
Run status group 0 (all jobs):
   READ: io=1693.7MB, aggrb=150874KB/s, minb=150874KB/s, maxb=150874KB/s, mint=11491msec, maxt=11491msec
  WRITE: io=1696.4MB, aggrb=151170KB/s, minb=151170KB/s, maxb=151170KB/s, mint=11491msec, maxt=11491msec

Whoa, that's a drop! We only get ~147 MB/s now, which is more than 7 times slower! And this is on a totally idle machine!

Maybe, crypto is just slow

The first thing we considered was ensuring we use the fastest crypto implementation. cryptsetup allows us to benchmark all the available crypto implementations on the system to select the best one:

$ sudo cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1      1340890 iterations per second for 256-bit key
PBKDF2-sha256    1539759 iterations per second for 256-bit key
PBKDF2-sha512    1205259 iterations per second for 256-bit key
PBKDF2-ripemd160  967321 iterations per second for 256-bit key
PBKDF2-whirlpool  720175 iterations per second for 256-bit key
#  Algorithm | Key |  Encryption |  Decryption
     aes-cbc   128b   969.7 MiB/s  3110.0 MiB/s
 serpent-cbc   128b           N/A           N/A
 twofish-cbc   128b           N/A           N/A
     aes-cbc   256b   756.1 MiB/s  2474.7 MiB/s
 serpent-cbc   256b           N/A           N/A
 twofish-cbc   256b           N/A           N/A
     aes-xts   256b  1823.1 MiB/s  1900.3 MiB/s
 serpent-xts   256b           N/A           N/A
 twofish-xts   256b           N/A           N/A
     aes-xts   512b  1724.4 MiB/s  1765.8 MiB/s
 serpent-xts   512b           N/A           N/A
 twofish-xts   512b           N/A           N/A

It seems aes-xts with a 256-bit data encryption key is the fastest here. But which one are we actually using for our encrypted ramdisk?

$ sudo dmsetup table /dev/mapper/encrypted-ram0
0 8388608 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 1:0 0

We do use aes-xts with a 256-bit data encryption key (count all the zeroes conveniently masked by the dmsetup tool; if you want to see the actual bytes, add the --showkeys option to the above command). The numbers do not add up, however. cryptsetup benchmark does tell us above not to rely on the results, as "Tests are approximate using memory only (no storage IO)", but that is exactly how we've set up our experiment using the ramdisk. In a somewhat worse case (assuming we're reading all the data and then encrypting/decrypting it sequentially with no parallelism), a back-of-the-envelope calculation says we should be getting around (1126 * 1823) / (1126 + 1823) =~ 696 MB/s, which is still quite far from the actual 147 * 2 = 294 MB/s (total for reads and writes).
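The same back-of-the-envelope estimate as a tiny script (assuming strictly sequential read-then-crypt stages, i.e. the harmonic combination of the two throughputs):

```javascript
// If reading from the ramdisk and the crypto work happen strictly one after
// the other with no parallelism, the combined throughput is bounded by the
// harmonic combination of the two stage throughputs.
const ramdiskMBs = 1126 // measured plaintext ramdisk throughput
const cryptoMBs = 1823  // cryptsetup benchmark, aes-xts 256b
const expected = (ramdiskMBs * cryptoMBs) / (ramdiskMBs + cryptoMBs)
console.log(Math.round(expected)) // 696 (MB/s)
```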

dm-crypt performance flags

While reading the cryptsetup man page we noticed that it has two options prefixed with --perf-, which are probably related to performance tuning. The first one is --perf-same_cpu_crypt with a rather cryptic description:

Perform encryption using the same cpu that IO was submitted on.  The default is to use an unbound workqueue so that encryption work is automatically balanced between available CPUs.  This option is only relevant for open action.

So we enable the option

$ sudo cryptsetup close encrypted-ram0
$ sudo cryptsetup open --header crypthdr.img --perf-same_cpu_crypt /dev/ram0 encrypted-ram0

Note: according to the latest man page there is also a cryptsetup refresh command, which can be used to enable these options live without having to "close" and "re-open" the encrypted device. Our cryptsetup however didn't support it yet.

Verifying if the option has been really enabled:

$ sudo dmsetup table encrypted-ram0
0 8388608 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 1:0 0 1 same_cpu_crypt

Yes, we can now see same_cpu_crypt in the output, which is what we wanted. Let's rerun the benchmark:

$ sudo fio --filename=/dev/mapper/encrypted-ram0 --readwrite=readwrite --bs=4k --direct=1 --loops=1000000 --name=crypt
crypt: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
Starting 1 process
Run status group 0 (all jobs):
   READ: io=1596.6MB, aggrb=139811KB/s, minb=139811KB/s, maxb=139811KB/s, mint=11693msec, maxt=11693msec
  WRITE: io=1600.9MB, aggrb=140192KB/s, minb=140192KB/s, maxb=140192KB/s, mint=11693msec, maxt=11693msec

Hmm, now it is ~136 MB/s which is slightly worse than before, so no good. What about the second option --perf-submit_from_crypt_cpus:

Disable offloading writes to a separate thread after encryption.  There are some situations where offloading write bios from the encryption threads to a single thread degrades performance significantly.  The default is to offload write bios to the same thread.  This option is only relevant for open action.

Maybe, we are in the "some situation" here, so let's try it out:

$ sudo cryptsetup close encrypted-ram0
$ sudo cryptsetup open --header crypthdr.img --perf-submit_from_crypt_cpus /dev/ram0 encrypted-ram0
Enter passphrase for /dev/ram0:
$ sudo dmsetup table encrypted-ram0
0 8388608 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 1:0 0 1 submit_from_crypt_cpus

And now the benchmark:

$ sudo fio --filename=/dev/mapper/encrypted-ram0 --readwrite=readwrite --bs=4k --direct=1 --loops=1000000 --name=crypt
crypt: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
Starting 1 process
Run status group 0 (all jobs):
   READ: io=2066.6MB, aggrb=169835KB/s, minb=169835KB/s, maxb=169835KB/s, mint=12457msec, maxt=12457msec
  WRITE: io=2067.7MB, aggrb=169965KB/s, minb=169965KB/s, maxb=169965KB/s, mint=12457msec, maxt=12457msec

~166 MB/s, which is a bit better, but still not good...

Asking the community

Being desperate, we decided to seek support from the Internet and posted our findings to the dm-crypt mailing list, but the response we got was not very encouraging:

If the numbers disturb you, then this is from lack of understanding on your side. You are probably unaware that encryption is a heavy-weight operation...

We decided to do some scientific research on this topic by typing "is encryption expensive" into Google Search, and one of the top results, which actually contains meaningful measurements, is... our own post about the cost of encryption, but in the context of TLS! This is a fascinating read on its own, but the gist is: modern crypto on modern hardware is very cheap, even at Cloudflare scale (doing millions of encrypted HTTP requests per second). In fact, it is so cheap that Cloudflare was the first provider to offer free SSL/TLS for everyone.

Digging into the source code

When trying out the custom dm-crypt options described above, we were curious why they exist in the first place and what that "offloading" is all about. Originally we expected dm-crypt to be a simple "proxy", which just encrypts/decrypts data as it flows through the stack. It turns out dm-crypt does more than just encrypting memory buffers, and a (simplified) IO traversal path diagram is presented below:

Speeding up Linux disk encryption

When the file system issues a write request, dm-crypt does not process it immediately; instead, it puts it into a workqueue named "kcryptd". In a nutshell, a kernel workqueue just schedules some work (encryption in this case) to be performed at some later time, when it is more convenient. When "the time" comes, dm-crypt sends the request to the Linux Crypto API for actual encryption. However, the modern Linux Crypto API is asynchronous as well, so depending on which particular implementation your system uses, the request most likely will not be processed immediately, but queued again for a "later time". When the Linux Crypto API finally does the encryption, dm-crypt may try to sort pending write requests by putting each request into a red-black tree. Then a separate kernel thread, again at "some time later", actually takes all the IO requests in the tree and sends them down the stack.

Now for read requests: this time we need to get the encrypted data from the hardware first, but dm-crypt does not just ask the driver for the data - it queues the request into a different workqueue named "kcryptd_io". At some point later, when we actually have the encrypted data, we schedule it for decryption using the now familiar "kcryptd" workqueue. "kcryptd" will send the request to the Linux Crypto API, which may decrypt the data asynchronously as well.

To be fair, the request does not always traverse all these queues, but the important part is that write requests may be queued up to 4 times in dm-crypt and read requests up to 3 times. At this point we wondered whether all this extra queueing could cause performance issues. For example, there is a nice presentation from Google about the relationship between queueing and tail latency. One key takeaway from the presentation is:

A significant amount of tail latency is due to queueing effects

So, why are all these queues there and can we remove them?

Git archeology

No-one writes more complex code just for fun, especially for the OS kernel. So all these queues must have been put there for a reason. Luckily, the Linux kernel source is managed by git, so we can try to retrace the changes and the decisions around them.

The "kcryptd" workqueue was in the source since the beginning of the available history with the following comment:

Needed because it would be very unwise to do decryption in an interrupt context, so bios returning from read requests get queued here.

So it was for reads only, but even then - why do we care whether it is interrupt context or not, if the Linux Crypto API will likely use a dedicated thread/queue for encryption anyway? Well, back in 2005 the Crypto API was not asynchronous, so this made perfect sense.

In 2006 dm-crypt started to use the "kcryptd" workqueue not only for encryption, but for submitting IO requests:

This patch is designed to help dm-crypt comply with the new constraints imposed by the following patch in -mm: md-dm-reduce-stack-usage-with-stacked-block-devices.patch

It seems the goal here was not to add more concurrency, but rather to reduce kernel stack usage, which makes sense again: the kernel has a common stack across all the code, so it is a quite limited resource. It is worth noting, however, that the Linux kernel stack was expanded in 2014 for x86 platforms, so this might not be a problem anymore.

A first version of "kcryptd_io" workqueue was added in 2007 with the intent to avoid:

starvation caused by many requests waiting for memory allocation...

The request processing was bottlenecking on a single workqueue here, so the solution was to add another one. Makes sense.

We are definitely not the first ones experiencing performance degradation because of extensive queueing: in 2011 a change was introduced to conditionally revert some of the queueing for read requests:

If there is enough memory, code can directly submit bio instead queuing this operation in a separate thread.

Unfortunately, at that time Linux kernel commit messages were not as verbose as today, so there is no performance data available.

In 2015 dm-crypt started to sort writes in a separate "dmcrypt_write" thread before sending them down the stack:

On a multiprocessor machine, encryption requests finish in a different order than they were submitted. Consequently, write requests would be submitted in a different order and it could cause severe performance degradation.

It does make sense, as sequential disk access used to be much faster than random access, and dm-crypt was breaking the pattern. But this mostly applies to spinning disks, which were still dominant in 2015. It may not be as important with modern fast SSDs (including NVMe SSDs).

Another part of the commit message is worth mentioning: ...in particular it enables IO schedulers like CFQ to sort more effectively...

It mentions the performance benefits for the CFQ IO scheduler, but Linux schedulers have improved since then, to the point that the CFQ scheduler was removed from the kernel in 2018.

The same patchset replaces the sorting list with a red-black tree:

In theory the sorting should be performed by the underlying disk scheduler, however, in practice the disk scheduler only accepts and sorts a finite number of requests. To allow the sorting of all requests, dm-crypt needs to implement its own sorting.

The overhead associated with rbtree-based sorting is considered negligible so it is not used conditionally.

All that makes sense, but it would be nice to have some backing data.

Interestingly, in the same patchset we see the introduction of our familiar "submit_from_crypt_cpus" option:

There are some situations where offloading write bios from the encryption threads to a single thread degrades performance significantly

Overall, we can see that every change was reasonable and needed at the time. However, things have changed since then:

  • hardware became faster and smarter
  • Linux resource allocation was revisited
  • coupled Linux subsystems were rearchitected

And many of the design choices above may not be applicable to modern Linux.

The "clean-up"

Based on the research above we decided to try to remove all the extra queueing and asynchronous behaviour and revert dm-crypt to its original purpose: simply encrypt/decrypt IO requests as they pass through. But for the sake of stability and further benchmarking we ended up not removing the actual code, but rather adding yet another dm-crypt option, which bypasses all the queues/threads, if enabled. The flag allows us to switch between the current and new behaviour at runtime under full production load, so we can easily revert our changes should we see any side-effects. The resulting patch can be found on the Cloudflare GitHub Linux repository.

Synchronous Linux Crypto API

From the diagram above we remember that not all queueing is implemented in dm-crypt. The modern Linux Crypto API may also be asynchronous, and for the sake of this experiment we want to eliminate queues there as well. What does "may be" mean, though? The OS may contain different implementations of the same algorithm (for example, hardware-accelerated AES-NI on x86 platforms and generic C-code AES implementations). By default the system chooses the "best" one based on the configured algorithm priority. dm-crypt allows overriding this behaviour and requesting a particular cipher implementation using the capi: prefix. However, there is one problem. Let us actually check the available AES-XTS (this is our disk encryption cipher, remember?) implementations on our system:

$ grep -A 11 'xts(aes)' /proc/crypto
name         : xts(aes)
driver       : xts(ecb(aes-generic))
module       : kernel
priority     : 100
refcnt       : 7
selftest     : passed
internal     : no
type         : skcipher
async        : no
blocksize    : 16
min keysize  : 32
max keysize  : 64
name         : __xts(aes)
driver       : cryptd(__xts-aes-aesni)
module       : cryptd
priority     : 451
refcnt       : 1
selftest     : passed
internal     : yes
type         : skcipher
async        : yes
blocksize    : 16
min keysize  : 32
max keysize  : 64
name         : xts(aes)
driver       : xts-aes-aesni
module       : aesni_intel
priority     : 401
refcnt       : 1
selftest     : passed
internal     : no
type         : skcipher
async        : yes
blocksize    : 16
min keysize  : 32
max keysize  : 64
name         : __xts(aes)
driver       : __xts-aes-aesni
module       : aesni_intel
priority     : 401
refcnt       : 7
selftest     : passed
internal     : yes
type         : skcipher
async        : no
blocksize    : 16
min keysize  : 32
max keysize  : 64
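Output like the above can also be filtered programmatically. Here is a small Python sketch (not part of the original toolchain - the helper names are ours) that parses /proc/crypto-style text and lists the drivers of synchronous xts(aes) implementations:

```python
def parse_proc_crypto(text):
    """Parse /proc/crypto-style "key : value" lines into a list of
    dicts, one per algorithm entry (each entry starts with a "name" line)."""
    entries, current = [], {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not "key : value"
        key, value = key.strip(), value.strip()
        if key == "name" and current:
            entries.append(current)  # a new entry begins
            current = {}
        current[key] = value
    if current:
        entries.append(current)
    return entries


def sync_xts_drivers(text):
    """Return drivers of synchronous (async: no) xts(aes) implementations."""
    return [e["driver"] for e in parse_proc_crypto(text)
            if e.get("name", "").endswith("xts(aes)") and e.get("async") == "no"]
```

Run over the output above, this yields exactly the two synchronous candidates: xts(ecb(aes-generic)) and __xts-aes-aesni.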

We want to explicitly select a synchronous cipher from the above list to avoid queueing effects in threads, but the only two synchronous ones are xts(ecb(aes-generic)) (the generic C implementation) and __xts-aes-aesni (the x86 hardware-accelerated implementation). We definitely want the latter, as it is much faster (we're aiming for performance here), but it is suspiciously marked as internal (see internal: yes). If we check the source code:

Mark a cipher as a service implementation only usable by another cipher and never by a normal user of the kernel crypto API

So this cipher is meant to be used only by other wrapper code in the Crypto API and not outside it. In practice this means that the caller of the Crypto API needs to explicitly specify this flag when requesting a particular cipher implementation, but dm-crypt does not do so, because by design it is not part of the Linux Crypto API, but rather an "external" user. We already patch the dm-crypt module, so we might as well just add the relevant flag. However, there is another problem with AES-NI in particular: the x86 FPU. "Floating point" you say? Why do we need floating point math to do symmetric encryption, which should only be about bit shifts and XOR operations? We don't need the math, but AES-NI instructions use some of the CPU registers that are dedicated to the FPU. Unfortunately, the Linux kernel does not always preserve these registers in interrupt context for performance reasons (saving/restoring the FPU is expensive). But dm-crypt may execute code in interrupt context, so we risk corrupting some other process's data, and we are back to the "it would be very unwise to do decryption in an interrupt context" statement from the original code.

Our solution to address the above was to create another somewhat "smart" Crypto API module. This module is synchronous and does not roll its own crypto, but is just a "router" of encryption requests:

  • if we can use the FPU (and thus AES-NI) in the current execution context, we just forward the encryption request to the faster, "internal" __xts-aes-aesni implementation (and we can use it here, because now we are part of the Crypto API)
  • otherwise, we just forward the encryption request to the slower, generic C-based xts(ecb(aes-generic)) implementation
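The routing logic itself is tiny. Here is a Python sketch of the decision (purely illustrative - the real xtsproxy is kernel C and checks FPU availability in-kernel, e.g. via irq_fpu_usable() on x86; the function names below are stand-ins):

```python
def aesni_xts(data):
    # Stand-in for the hardware-accelerated "__xts-aes-aesni" implementation.
    return ("aesni", data)


def generic_xts(data):
    # Stand-in for the generic C "xts(ecb(aes-generic))" implementation.
    return ("generic", data)


def xtsproxy(data, fpu_usable):
    """Forward the request to AES-NI when the FPU is usable in the current
    execution context, otherwise fall back to the generic implementation."""
    return aesni_xts(data) if fpu_usable else generic_xts(data)
```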

Using the whole lot

Let's walk through the process of using it all together. The first step is to grab the patches and recompile the kernel (or just compile dm-crypt and our xtsproxy modules).

Next, let's restart our IO workload in a separate terminal, so we can make sure we can reconfigure the kernel at runtime under load:

$ sudo fio --filename=/dev/mapper/encrypted-ram0 --readwrite=readwrite --bs=4k --direct=1 --loops=1000000 --name=crypt
crypt: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
Starting 1 process

In the main terminal make sure our new Crypto API module is loaded and available:

$ sudo modprobe xtsproxy
$ grep -A 11 'xtsproxy' /proc/crypto
driver       : xts-aes-xtsproxy
module       : xtsproxy
priority     : 0
refcnt       : 0
selftest     : passed
internal     : no
type         : skcipher
async        : no
blocksize    : 16
min keysize  : 32
max keysize  : 64
ivsize       : 16
chunksize    : 16

Reconfigure the encrypted disk to use our newly loaded module and enable our patched dm-crypt flag (we have to use the low-level dmsetup tool, as cryptsetup obviously is not aware of our modifications):

$ sudo dmsetup table encrypted-ram0 --showkeys | sed 's/aes-xts-plain64/capi:xts-aes-xtsproxy-plain64/' | sed 's/$/ 1 force_inline/' | sudo dmsetup reload encrypted-ram0

We just "loaded" the new configuration, but for it to take effect, we need to suspend/resume the encrypted device:

$ sudo dmsetup suspend encrypted-ram0 && sudo dmsetup resume encrypted-ram0

And now observe the result. We may go back to the other terminal running the fio job and look at the output, but to make things nicer, here's a snapshot of the observed read/write throughput in Grafana:

[Charts: read and write throughput before and after enabling the patches]

Wow, we have more than doubled the throughput! With the total throughput of ~640 MB/s we're now much closer to the expected ~696 MB/s from above. What about the IO latency? (The await statistic from the iostat reporting tool):

[Chart: IO latency (await) before and after enabling the patches]

The latency has been cut in half as well!

To production

So far we have been using a synthetic setup with some parts of the full production stack missing, like file systems, real hardware and most importantly, production workload. To ensure we’re not optimising imaginary things, here is a snapshot of the production impact these changes bring to the caching part of our stack:

[Chart: p99 cache-hit response times: unencrypted vs default encryption vs patched encryption]

This graph represents a three-way comparison of the worst-case response times (99th percentile) for a cache hit in one of our servers. The green line is from a server with unencrypted disks, which we will use as baseline. The red line is from a server with encrypted disks with the default Linux disk encryption implementation and the blue line is from a server with encrypted disks and our optimisations enabled. As we can see the default Linux disk encryption implementation has a significant impact on our cache latency in worst case scenarios, whereas the patched implementation is indistinguishable from not using encryption at all. In other words the improved encryption implementation does not have any impact at all on our cache response speed, so we basically get it for free! That’s a win!

We're just getting started

This post shows how an architecture review can double the performance of a system. We also reconfirmed that modern cryptography is not expensive and there is usually no excuse not to protect your data.

We are going to submit this work for inclusion in the main kernel source tree, but most likely not in its current form. Although the results look encouraging we have to remember that Linux is a highly portable operating system: it runs on powerful servers as well as small resource constrained IoT devices and on many other CPU architectures as well. The current version of the patches just optimises disk encryption for a particular workload on a particular architecture, but Linux needs a solution which runs smoothly everywhere.

That said, if you think your case is similar and you want to take advantage of the performance improvements now, you may grab the patches and hopefully provide feedback. The runtime flag makes it easy to toggle the functionality on the fly and a simple A/B test may be performed to see if it benefits any particular case or setup. These patches have been running across our wide network of more than 200 data centres on five generations of hardware, so can be reasonably considered stable. Enjoy both performance and security from Cloudflare for all!


Tuesday, 24 March


Saturday Morning Breakfast Cereal - Toilet Paper [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

The fun question is how long before this comic becomes topical again?

Today's News:


The Bandwidth Alliance Charges Forward with New Partners - Alibaba, Zenlayer, and Cherry Servers [The Cloudflare Blog]


We started the Bandwidth Alliance in 2018 with a group of like-minded cloud and networking partners. Our common goal was to help our mutual customers reduce or eliminate data transfer charges, sometimes known as “bandwidth” or “egress” fees, between the cloud and the consumer. By reducing or eliminating these costs, our customers can more easily choose a best-of-breed set of solutions, because they don’t have to worry about data charges from moving workloads between vendors and thereby becoming locked in to a single provider for all their needs. Today we’re announcing an important milestone: the addition of Alibaba, Zenlayer, and Cherry Servers to the Bandwidth Alliance, expanding it to a total of 20 partners. These partners offer our customers a wide choice of cloud services and products, each suited to different needs.

In addition, we are working with our existing partners, including Microsoft Azure, Digital Ocean and several others, to onboard customers and provide them the benefits of the Bandwidth Alliance. Contact us if you are interested.

Customer savings  

Over the past year we have seen several customers take advantage of the Bandwidth Alliance and wanted to highlight two examples.

Nodecraft, which gives users an easy way to set up their own game servers, is a perfect example of how the Bandwidth Alliance helped to cut down egress costs. Nodecraft supports games like Minecraft, ARK: Survival Evolved, and Counter-Strike. As Nodecraft’s popularity increased, so did their AWS bill. They were being charged not only for the storage they were using, but also for ‘egress’, or data transfer fees out of AWS. They made the decision to move their storage to Backblaze. Now they use Backblaze’s B2 Storage and Cloudflare’s extensive network to deliver content to customers without any egress charges. Read more about their journey here and here.

Pippa provides simple and smart solutions for podcasting, including hosting, analytics, and ads. The most important part of Pippa’s business is rapid and reliable asset delivery. As Pippa grew, they were pushing millions of large audio files to listeners worldwide. This resulted in significantly increased costs, including excessive data egress fees for retrieving data from a cloud storage service such as AWS S3. DigitalOcean waives egress fees for transferring data to Cloudflare, effectively creating a zero-cost data bridge from DigitalOcean to Cloudflare’s global network. Pippa moved to DigitalOcean storage and Cloudflare’s global cloud security and delivery network. With the combination of lower-cost storage and zero egress fees, Pippa saw a 50% savings on their cloud bill.

What could your savings be?

Nodecraft and Pippa are just two examples of small businesses that are seeing significant cost savings from the Bandwidth Alliance. They both chose the best storage and cloud solution for their use case, plus a global cloud network from Cloudflare, without any tax on transferring data between the two products. With our newly added partners we expect many more customers to benefit.

You may be asking: ‘How much can I save?’ To help you get a sense of the scale of your potential savings by moving to the Bandwidth Alliance, we have put together a calculator. Fill in your egress details to figure out how much you could be saving. We hope this is a helpful resource as you evaluate your cloud platform choices and spend.
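The underlying arithmetic is simple: savings are roughly your monthly egress volume multiplied by your provider’s per-GB egress rate. A sketch (the $0.09/GB default is an illustrative on-demand cloud rate, not any provider’s actual price):

```python
def monthly_egress_savings(gb_per_month, rate_per_gb=0.09):
    """Estimated monthly savings if egress fees are waived entirely.
    rate_per_gb is an illustrative on-demand cloud egress rate."""
    return gb_per_month * rate_per_gb
```

For example, 10 TB (10,000 GB) of monthly egress at that rate comes to roughly $900 a month.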


Why We Started Putting Unpopular Assets in Memory [The Cloudflare Blog]


Part of Cloudflare's service is a CDN that makes millions of Internet properties faster and more reliable by caching web assets closer to browsers and end users.

We make improvements to our infrastructure to make end-user experiences faster, more secure, and more reliable all the time. Here’s a case study of one such engineering effort where something counterintuitive turned out to be the right approach.

Our storage layer, which serves millions of cache hits per second globally, is powered by high IOPS NVMe SSDs.

Although SSDs are fast and reliable, cache hit tail latency within our system is dominated by the IO capacity of our SSDs. Moreover, because flash memory chips wear out, a non-negligible portion of our operational cost, including the cost of new devices, shipment, labor and downtime, is spent on replacing dead SSDs.

Recently, we developed a technology that reduces our hit tail latency and reduces the wear out of SSDs. This technology is a memory-SSD hybrid storage system that puts unpopular assets in memory.

The end result: cache hits from our infrastructure are now faster for all customers.

You may have thought that was a typo in my explanation of how the technique works. In fact, a few colleagues thought the same when we proposed this project internally, “I think you meant to say ‘popular assets’ in your document”. Intuitively, we should only put popular assets in memory since memory is even faster than SSDs.

You are not wrong. But I’m also correct. This blog post explains why.

First let me explain how we already use memory to speed up the IO of popular assets.

Page cache

Since it is so obvious that memory speeds up popular assets, Linux already does much of the hard work for us. This functionality is called the “page cache”. Files in Linux systems are organized into pages internally, hence the name.

The page cache uses the system’s available memory to cache reads from and buffer writes to durable storage. During typical operations of a Cloudflare server, all the services and processes themselves do not consume all physical memory. The remaining memory is used by the page cache.

Below is the memory layout of a typical Cloudflare edge server with 256GB of physical memory.

[Chart: memory layout of a typical edge server: per-service RSS on top, page cache usage at the bottom]

We can see the used memory (RSS) on top. RSS memory consumption for all services is about 87.71 GB, with our cache service consuming 4.1 GB. At the bottom we see overall page cache usage. Total size is 128.6 GB, with the cache service using 41.6 GB of it to serve cached web assets.

Caching reads

Pages that are frequently used are cached in memory to speed up reads. Linux uses the least recently used (LRU) algorithm among other techniques to decide what to keep in page cache. In other words, popular assets are already in memory.
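As a toy illustration of the eviction idea (the kernel actually approximates LRU with active/inactive page lists rather than a strict list like this), a strict LRU cache can be sketched as:

```python
from collections import OrderedDict


class LRUCache:
    """Toy strict-LRU cache: the least recently used entry is evicted
    first when capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def get(self, key):
        if key not in self.pages:
            return None
        self.pages.move_to_end(key)  # mark as most recently used
        return self.pages[key]

    def put(self, key, value):
        if key in self.pages:
            self.pages.move_to_end(key)
        self.pages[key] = value
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
```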

Buffering writes

The page cache also buffers writes. Writes are not synchronized to disk right away; they are buffered until the page cache decides to flush them at a later time, either periodically or due to memory pressure. This also reduces repetitive writes to the same file.

However, because of the nature of the caching workload, a web asset arriving at our cache system is usually static. Unlike database workloads, where a single value can get repeatedly updated, in our caching workload there are very few repetitive updates to an asset before it is completely cached. Therefore, the page cache does not help in reducing writes to disk, since all pages will eventually be written out.

Smarter than the page cache?

Although the Linux page cache works great out of the box, our caching system can still outsmart it since our system knows more about the context of the data being accessed, such as the content type, access frequency and the time-to-live value.

Instead of improving the caching of popular assets, which the page cache already handles, the next section focuses on something the page cache cannot do well: putting unpopular assets in memory.

Putting unpopular assets in memory

Motivation: Writes are bad

SSD wear-out is caused solely by writes, not reads. As mentioned above, it is costly to replace SSDs. Since Cloudflare operates data centers in 200 cities across the world, the cost of shipment and replacement can be a significant proportion of the cost of the SSD itself.

Moreover, write operations on SSDs slow down read operations. This is because SSD writes require program/erase (P/E) operations issued to the storage chips, and these operations block reads to the same chips. In short, the more writes, the slower the reads. Since the page cache already holds popular assets, this effect has the highest impact on our cache hit tail latency. Previously we talked about how we dramatically reduced the impact of such latency; directly reducing the latency itself will also be very helpful.

Motivation: Many assets are “one-hit-wonders”

One-hit-wonders are cached assets that are requested only once: they get accessed once, cached, and then never read again. In addition, one-hit-wonders also include cached assets that, due to their lack of popularity, are evicted before they are ever read back. The performance of websites is not harmed even if we don’t cache such assets in the first place.

To quantify how many of our assets are one-hit-wonders and to test the hypothesis that reducing disk writes can speed up disk reads, we conducted a simple experiment. The experiment: using a representative sample of traffic, we modified our caching logic to only cache an asset the second time our server encountered it.
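The experiment’s modified rule can be sketched in a few lines (a toy model with hypothetical names; the production logic is of course more involved):

```python
def handle_request(key, seen, cache, fetch_from_origin):
    """Cache an asset only on its second encounter: the first miss just
    records that the asset was seen, avoiding a disk write for
    one-hit-wonders."""
    if key in cache:
        return cache[key]          # cache hit
    value = fetch_from_origin(key)
    if key in seen:
        cache[key] = value         # second miss: now worth caching
    else:
        seen.add(key)              # first miss: remember, don't cache
    return value
```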

[Charts: disk writes per second and disk hit tail latency, experimental vs control group]

The red line indicates when the experiment started. The green line represents the experimental group and the yellow line represents the control group.

The result: disk writes per second were reduced by roughly half and corresponding disk hit tail latency was reduced by approximately five percent.

This experiment demonstrated that if we don’t cache one-hit-wonders, our disks last longer and our cache hits are faster.

One more benefit of not caching one-hit-wonders, which was not immediately obvious from the experiment, is that it increases the effective capacity of the cache: removing unpopular assets reduces the competing pressure on cache space. This in turn increases the cache hit ratio and cache retention.

The next question: how can we replicate these results, at scale, in production, without impacting customer experience negatively?

Potential implementation approaches

Remember but don’t cache

One smart way to eliminate one-hit-wonders from cache is to not cache an asset during its first few misses but to remember its appearances. When the cache system encounters the same asset repeatedly over a certain number of times, the system will start to cache the asset. This is basically what we did in the experiment above.

In order to remember the appearances, besides hash tables, many memory-efficient data structures, such as Bloom filters, Counting Bloom filters and Count-min sketches, can be used, each with its own trade-offs.
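As an illustration, a Counting Bloom filter tracks approximate per-key counts in a fixed amount of memory; the sketch below uses an arbitrary size and hash scheme (counts can be over-estimated on collisions, never under-estimated):

```python
import hashlib


class CountingBloomFilter:
    """Minimal Counting Bloom filter: remember roughly how often a key
    was seen without storing the keys themselves."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.counters = [0] * size

    def _positions(self, key):
        # Derive `hashes` counter positions from a salted SHA-256 digest.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.counters[pos] += 1

    def count(self, key):
        # An upper bound on how many times key was added.
        return min(self.counters[pos] for pos in self._positions(key))
```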

Such an approach solves one problem but introduces a new one: every asset is missed at least twice before it is cached. Because the number of cache misses is amplified compared to what we do today, this approach multiplies both the cost of bandwidth and server utilization for our customers. This is not an acceptable tradeoff in our eyes.

Transient cache in memory

A better idea we came up with: first put every asset we want to cache to disk in memory. This memory is called the “transient cache”. Assets in the transient cache are promoted to the permanent cache, backed by SSDs, after they are accessed a certain number of times, indicating they are popular enough to be stored persistently. If they are not promoted, they are eventually evicted from the transient cache due to lack of popularity.

[Diagram: transient cache in memory in front of the persistent SSD-backed cache, with promotion on repeated access]

This design makes sure that disk writes are only spent on assets that are popular, while ensuring every asset is only “missed” once.
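The promotion rule can be sketched in a few lines (a toy model with hypothetical names; the real system also evicts under memory pressure and runs at far higher concurrency):

```python
class TransientCache:
    """Toy memory-SSD hybrid: every asset lands in memory first and is
    promoted to 'disk' only after enough hits. Eviction of unpopular
    memory entries is omitted for brevity."""

    def __init__(self, promote_after=2):
        self.promote_after = promote_after
        self.memory = {}  # asset -> (value, hits): the transient cache
        self.disk = {}    # stand-in for the persistent SSD-backed cache

    def put(self, key, value):
        # Every asset lands in memory first; no disk write yet.
        self.memory[key] = (value, 0)

    def get(self, key):
        if key in self.disk:
            return self.disk[key]
        if key in self.memory:
            value, hits = self.memory[key]
            hits += 1
            if hits >= self.promote_after:
                self.disk[key] = value  # popular: promote, one disk write
                del self.memory[key]
            else:
                self.memory[key] = (value, hits)
            return value
        return None  # miss
```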

The transient cache system on the real world Internet

We implemented this idea and deployed it to our production caching systems. Here are some learnings and data points we would like to share.


Our transient cache technology does not pull a performance improvement rabbit out of a hat. It achieves improved performance for our customers by consuming other resources in our system. These resource consumption trade-offs must be carefully considered when deploying to systems at the scale at which Cloudflare operates.

Transient cache memory footprint

The amount of memory allocated for our transient cache matters. Cache size directly dictates how long a given asset will live in cache before it is evicted. If the transient cache is too small, new assets will evict old assets before the old assets receive the hits that would promote them to disk. From a customer's perspective, cache retention being too short is unacceptable because it leads to more misses, commensurately higher cost, and worse eyeball performance.

Competition with page cache

As mentioned before, the page cache uses the system’s available memory. The more memory we allocate to the transient cache for unpopular assets, the less memory we have for popular assets in the page cache. Finding the sweet spot for the trade-off between these two depends on traffic volume, usage patterns, and the specific hardware configuration our software is running on.

Competition with process memory usage

Another competitor to transient cache is regular memory used by our services and processes. Unlike the page cache, which is used opportunistically, process memory usage is a hard requirement for the operation of our software. An additional wrinkle is that some of our edge services increase their total RSS memory consumption when performing a “zero downtime upgrade”, which runs both the old and new version of the service in parallel for a short period of time. From our experience, if there is not enough physical memory to do so, the system’s overall performance will degrade to an unacceptable level due to increased IO pressure from reduced page cache space.

In production

Given the considerations above, we enabled this technology conservatively to start. The new hybrid storage system is used only by our newer generations of servers that have more physical memory than older generations. Additionally, the system is only used on a subset of assets. By tuning the size of the transient cache and what percentage of requests use it at all, we are able to explore the sweet spots for the trade-offs between performance, cache retention, and memory resource consumption.

The chart below shows the IO usage of an enabled cohort before and after (red line) enabling transient cache.

[Chart: disk write bytes per second before and after enabling the transient cache]

We can see that disk write (in bytes per second) was reduced by 25% at peak and 20% off peak. Although it is too early to tell, the life-span of those SSDs should be extended proportionally.

More importantly for our customers: our CDN cache hit tail latency is measurably decreased!

[Chart: CDN cache hit tail latency after enabling the transient cache]

Future work

We just made the first step towards a smart memory-SSD hybrid storage system. There is still a lot to be done.

Broader deployment

Currently, we only apply this technology to specific hardware generations and a portion of traffic. When we decommission our older generations of servers and replace them with newer generations with more physical memory, we will be able to apply the transient cache to a larger portion of traffic. Per our preliminary experiments, in certain data centers, applying the transient cache to all traffic is able to reduce disk writes up to 70% and decrease end user visible tail latency by up to 20% at peak hours.

Smarter promotion algorithms

Our implemented promotion strategy for moving assets from transient cache to durable storage is based on a simple heuristic: the number of hits. Other information can be used to make a better promotion decision. For example, the TTL (time to live) of an asset can be taken into account as well. One such strategy is to refuse to promote an asset if its TTL has only a few seconds left, which further reduces unnecessary disk writes.
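Such a TTL-aware rule could be as simple as the following sketch (the thresholds are hypothetical):

```python
def should_promote(hits, ttl_remaining_s, min_hits=2, min_ttl_s=10):
    """Promote only when the asset is popular enough AND will live long
    enough on disk for the write to pay off."""
    return hits >= min_hits and ttl_remaining_s >= min_ttl_s
```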

Another aspect of the algorithms is demotion. We can actively or passively (during eviction) demote assets from persistent cache to transient cache based on the criteria mentioned above. The demotion itself does not directly reduce writes to persistent cache but it could help cache hit ratio and cache retention if done smartly.

Transient cache on other types of storage

We chose memory to host the transient cache because its wear-out cost is effectively zero. This is also the case for HDDs. It is possible to build a hybrid storage system across memory, SSDs and HDDs to find the right balance between their costs and performance characteristics.


By putting unpopular assets in memory, we are able to trade off between memory usage, tail hit latency and SSD lifetime. With our current configuration, we are able to extend SSD life while reducing tail hit latency, at the expense of available system memory. The transient cache also opens possibilities for future heuristic and storage-hierarchy improvements.

Whether the techniques described in this post benefit your system depends on your workload and hardware resource constraints. We hope that sharing this counterintuitive idea inspires other novel ways of building faster and more efficient systems.

And finally, if doing systems work like this sounds interesting to you, come work at Cloudflare!

Monday, 23 March


Saturday Morning Breakfast Cereal - Trust [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

I don't know about any other couples, but the odds of me or my wife successfully doing a trust game are negative.

Today's News:


Deploying security.txt: how Cloudflare’s security team builds on Workers [The Cloudflare Blog]


When the security team at Cloudflare takes on new projects, we approach them with the goal of achieving the “builder first mindset” whereby we design, develop, and deploy solutions just as any standard engineering team would. Additionally, we aim to dogfood our products wherever possible. Cloudflare as a security platform offers a lot of functionality that is vitally important to us, including, but not limited to, our WAF, Workers platform, and Cloudflare Access. We get a lot of value out of using Cloudflare to secure Cloudflare. Not only does this allow us to test the security of our products; it provides us an avenue of direct feedback to help improve the roadmaps for engineering projects.

One specific product that we get a lot of use out of is our serverless platform, Cloudflare Workers. With it, we can have incredible flexibility in the types of applications that we are able to build and deploy to our edge. An added bonus here is that our team does not have to manage a single server that our code runs on.

Today, we’re launching support for the security.txt initiative through Workers to help give security researchers a common location to learn about how to communicate with our team. In this post, I’m going to focus on some of the benefits of managing security projects through Workers, detail how we built out the mechanism that we’re using to manage security.txt on, and highlight some other use cases that our team has built on Workers in the past few months.

Building on Workers

The Workers platform is designed to let anyone deploy code to our edge network which currently spans across 200 cities in 90 countries. As a result of this, applications deployed on the platform perform exceptionally well, while maintaining high reliability. You can implement nearly any logic imaginable on top of zones in your Cloudflare account with almost no effort.

One of the biggest immediate benefits is that there are no servers for our team to maintain. Far too often I have seen a long-running application deployed onto an instanced machine in a cloud environment and immediately forgotten about. Outdated software or poorly provisioned permissions can open dangerous vectors for malicious actors when a compromise occurs. Our team can deploy new projects with confidence that they will remain secure. The Workers runtime environment has many wins, such as receiving automatic security updates; we maintain only the application logic, not the software stack.

Another benefit is that since Workers executes JavaScript, we can dream up complex applications and rapidly deploy them as full applications in production without a huge investment in engineering time. This encourages rapid experimentation and iteration while achieving high impact with consistent performance. A couple of examples:

  • Secure code review - On every pull request made to a code repository, a Worker runs a set of curated security rules and posts comments if there are matches.
  • CSP nonces and HTML rewriting - A Worker generates random nonces and uses the HTMLRewriter API to mutate responses with dynamic content security policies on our dashboard.
  • Authentication on legacy applications -  Using the Web Crypto API, a Worker sits in front of legacy origins and issues, validates, and signs JWTs to control access.

Stay tuned for future blog posts where we plan to dive deeper on these initiatives and more!


Our team regularly engages with external security researchers through our HackerOne program, and since about 2014 the details of our program have been available on a dedicated page on our marketing site. This worked quite well for starting up our program and lets anyone contact our team to report a vulnerability. Every so often, however, we saw issues with this approach. In some cases, vulnerabilities were submitted to our support staff, who would point the reporter to the disclosure page, which in turn directed them to HackerOne, resulting in an error-prone experience for everyone involved. Other times, the team that manages our social media would direct researchers to us.

Security researchers, having done the hard work to find a vulnerability, face challenges to responsibly disclose it. They need to be able to quickly locate contact information, consume it and disclose without delay. The security.txt initiative addresses this problem by defining a common location and format for organizations to provide this information to researchers. A specification for this was submitted to the IETF with the description:

“When security vulnerabilities are discovered by researchers, proper reporting channels are often lacking.  As a result, vulnerabilities may be left unreported.  This document defines a format ("security.txt") to help organizations describe their vulnerability disclosure practices to make it easier for researchers to report vulnerabilities.”

Employing the guidance in security.txt doesn't solve the problem entirely but it is becoming common best practice amongst many other security conscious companies. Over time I expect that researchers will rely on security.txt for information gathering. In fact, some integrations are already being built by the community into security tools to automatically fetch information from a company’s security.txt!
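For illustration, a minimal security.txt under the draft format might look like the following (the field values here are made up, not our actual file):

```
Contact: mailto:security@example.com
Encryption: https://example.com/pgp-key.txt
Expires: Thu, 31 Dec 2020 18:37:07 -0800
Preferred-Languages: en
```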

Our security.txt can be found here:

security.txt as a service -- built on Cloudflare Workers

Cloudflare's security.txt

When scoping out the work for deploying security.txt we quickly realized that there were a few areas we wanted to address when deploying the service:

  • Automation - Deploys should involve as few humans as possible.
  • Ease of maintenance - Necessary changes (specification updates, key rotations, etc) should be a single commit and deploy away.
  • Version control - security.txt is inherently a sensitive file; retaining attribution for every change is valuable from an auditing perspective.

Certainly, we could manage this project through a manual process involving extensive documentation and coordinated manual deployments, but we felt it was important to build it as a full, easily maintainable service. Manual maintenance and deployments do not scale well over time, especially for something that isn’t touched often. We wanted to make the process as easy as possible, since security.txt is meant to be a regularly maintained, living document.

We quickly turned to Workers for the task. Using the Wrangler CLI, we can meet the automation and ease-of-deployment requirements while tracking the project in git. In under half an hour I was able to build a full prototype that addressed all of the requirements of the Internet-Draft and deploy it to a staging instance of our website. Since then, members of our team have made revisions to the text itself, updated the expiration on our PGP key, and bumped up to the latest draft version.

One cool decision we made is that the security.txt file is created from a template at build time, which allows us to perform dynamic operations on our baseline security.txt. For example, Section 3.5.5 of the draft calls for a date and time after which the data contained in the "security.txt" file is considered stale and should not be used. To populate this field, we wrote a short node.js script that automatically sets an expiration 365 days after the point of deployment, encouraging regular updates:

const dayjs = require('dayjs')
const fs = require('fs')

const main = async () => {
  // Append the Expires field to the security.txt template before signing
  fs.appendFile(
    'src/txt/security.txt.temp',
    `\nExpires: ${dayjs()
      .add(365, 'day')
      // Thu, 31 Dec 2020 18:37:07 -0800
      .format('ddd, D MMM YYYY HH:mm:ss ZZ')}\n`,
    function(err) {
      if (err) throw err
      console.log('Wrote expiration field!')
    }
  )
}

main()
We’ve also leveraged this benefit in our Make targets where at build time we ensure that the deployed security.txt is clearsigned with the PGP key:

	gpg --local-user -o src/txt/security.txt --clearsign src/txt/security.txt.temp
	rm src/txt/security.txt.temp

And finally, leveraging the multi-route support in wrangler 1.8 alongside the request object that gets passed into our Worker, we can serve our PGP public key and security.txt at the same time on two routes using one Worker, differentiated based on what the eyeball is asking for:

import pubKey from './txt/security-cloudflare-public-06A67236.txt'
import securityTxt from './txt/security.txt'

const handleRequest = async request => {
  const { url } = request
  if (url.includes('/.well-known/security.txt')) {
    return new Response(securityTxt, {
      headers: { 'content-type': 'text/plain; charset=utf-8' }, // security.txt
    })
  } else if (url.includes('/gpg/security-cloudflare-public-06A67236.txt')) {
    return new Response(pubKey, {
      headers: { 'content-type': 'text/plain; charset=utf-8' }, // GPG public key
    })
  }
  return fetch(request) // Pass to origin
}

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

In the interest of transparency, and so anyone can easily achieve the same wins we did, we’ve open sourced the Worker itself for anyone who wants to deploy this service onto their Cloudflare zone. It takes just a few minutes.

You can find the project on GitHub:

What’s next?

The security team at Cloudflare has grown a significant amount in the past year. This is most evident in terms of our headcount (and still growing!) but also in how we take on new projects. We’re hoping to share more stories like this in the future and open source some of the other security services that we are building on Workers to help others achieve the same security wins that we are achieving with the platform.


Storage management with Cockpit [Fedora Magazine]

Cockpit is a very useful utility allowing you to manage a compatible system over the network from the comfort of a web browser (See the list of supported web browsers and Linux distributions). One such feature is the ability to manage storage configuration. Cockpit contains a frontend for udisks2 – it allows you to create new partitions or format, resize, mount, unmount or delete existing partitions without the need to do it manually from a terminal.

Note: please exercise caution when managing your system disk and its partitions – handling them incorrectly may leave your system unbootable or cause data loss.

Installing Cockpit

If you don’t have Cockpit installed yet you can do so by issuing:

sudo dnf install cockpit

Note: Depending on your install profile, Cockpit might already be installed and you can skip the installation step! Also, some users may need to install the cockpit-storaged package along with its dependencies if it is not already installed:

sudo dnf install cockpit-storaged

Add the service to the firewall:

sudo firewall-cmd --add-service=cockpit --permanent

Then reload the firewall so the permanent change takes effect immediately:

sudo firewall-cmd --reload

Afterwards enable and start the service:

sudo systemctl enable cockpit.socket --now

After this, everything should be ready, and Cockpit should be accessible by entering the computer’s IP address or network domain name in the browser, followed by port 9090. For example: https://cockpit-example.localdomain:9090

Note: you will need to authenticate as a privileged user to modify your storage configuration, so tick the “Reuse my password for privileged tasks” checkbox on the Cockpit login page.

Basic provisioning of the storage device

Visiting the “Storage” section will display various statistics and information about the state of the system storage. You can find information about the partitions, their respective mountpoints, realtime disk read/write stats and storage related log information. Also, you can format and partition any newly attached internal/external storage device or attach an NFS mount.

To format and partition a blank storage device, select the device under the “Devices” section by clicking on it. This brings you to the screen for the selected storage device. Here you’ll be able to create a new partition table or format and create new partitions. If the device is empty, Cockpit will describe its content as unknown.

Click on “Create New Partition Table” to prepare the device.
After the partition table has been created, create one or more partitions by clicking “Create Partition” – here you’ll be able to specify the size, name, mountpoint and mount options.

When partitioning the storage device you have a choice between “Don’t overwrite existing data” and “Overwrite existing data with zeroes” – the latter takes slightly longer but is useful if you want to confidently erase the contents of the device. Note that this may not be a sufficient substitute if your organisation has regulations on how securely storage data must be erased. If the defaults don’t suit your needs, you can also specify custom mount options.

To create a single partition taking up all the space on the device, just specify the name, for example “test”, then specify its mountpoint, such as “/mnt/test”, and click “Ok”. If you don’t want it mounted immediately, uncheck the “Mount Now” checkbox. Specifying the name is optional, but it will help you identify the partition when inspecting mountpoints. This creates a new XFS-formatted partition (the default recommended filesystem) named “test” and mounts it at “/mnt/test”.

Here’s an example of how that might look:

$ df -h
Filesystem                 Size      Used   Avail   Use% Mounted on
 /dev/mapper/fedora-root   15G       2.3G   13G     16%  /
 /dev/vda2                 1014M     185M   830M    19%  /boot
 /dev/vda1                 599M      8.3M   591M    2%   /boot/efi
 /dev/vdb1                 20G       175M   20G     1%   /mnt/test

It will also add the necessary entry to your /etc/fstab so that the partition gets mounted at boot.
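For illustration, the resulting /etc/fstab entry looks something like this (the UUID is placeholder text; Cockpit records your partition’s actual UUID, and the options may differ on your system):

```
# Illustrative entry only - Cockpit writes the partition's real UUID
UUID=2f8f0d1a-...  /mnt/test  xfs  defaults  0 0
```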

Logical Volume Management

Cockpit also lets users easily create and manage LVM and RAID storage devices. To create a new Logical Volume Group, click the burger menu button in the devices section and select Create Volume Group. Select the available storage device (only devices with unmounted partitions, or no partitions, will show up) to finish the process; afterwards, return to the storage section and select the newly created volume group.

From here you’ll be able to create individual logical volumes by clicking Create new Logical Volume. As with individual partitions, you can specify the size of the logical volume during creation if you don’t want to use all the available space in the volume group. After creating the logical volumes, you’ll still need to format them and specify mountpoints. This works just like creating individual partitions as described earlier, except that you select logical volumes instead of individual disk devices.

Here’s how a Logical Volume Group named “vgroup0” with two Logical Volumes (lvol0, named “test” and mounted on /mnt/test; lvol1, named “data” and mounted on /mnt/data) might look:

$ df -h
 Filesystem                 Size  Used Avail Use% Mounted on
 /dev/mapper/fedora-root     15G  2.3G   13G  16% /
 /dev/vda2                 1014M  185M  830M  19% /boot
 /dev/vda1                  599M  8.3M  591M   2% /boot/efi
 /dev/mapper/vgroup0-lvol0   10G  104M  9.9G   2% /mnt/test
 /dev/mapper/vgroup0-lvol1   10G  104M  9.9G   2% /mnt/data

Just like before – all the necessary information has been added to the configuration and should persist between system reboots.

Other storage related Cockpit features

Apart from the features described above, Cockpit also allows you to mount iSCSI disks and NFS shares located on the network. However, these resources are usually hosted on a dedicated server and require additional configuration beyond the scope of this article. At this time Cockpit itself doesn’t offer the ability to configure and serve iSCSI and NFS mounts, but this may be subject to change, as Cockpit is an open source project under active development.

Sunday, 22 March



Saturday Morning Breakfast Cereal - Magic [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

Honestly, the original version of this story violates conservation of energy.

Today's News:

Saturday, 21 March


Saturday Morning Breakfast Cereal - Awkard AI [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

Now that I think about it, awkward phases are probably evolution's way of creating appropriate reserves of shame.

Today's News:


Adding the Fallback Pool to the Load Balancing UI and other significant UI enhancements [The Cloudflare Blog]


The Cloudflare Load Balancer was introduced over three years ago to provide our customers with a powerful, easy to use tool to intelligently route traffic to their origins across the world. During the initial design process, one of the questions we had to answer was ‘where do we send traffic if all pools are down?’ We did not think it made sense just to drop the traffic, so we used the concept of a ‘fallback pool’ to send traffic to a ‘pool of last resort’ in the case that no pools were detected as available. While this may still result in an error, it gave an eyeball request a chance at being served successfully in case the pool was still up.

As a brief reminder, a load balancer helps route traffic across your origin servers to ensure your overall infrastructure stays healthy and available. Load Balancers are made up of pools, which can be thought of as collections of servers in a particular location.

Over the past three years, we’ve made many updates to the dashboard, and the new designs now surface the fallback pool in the UI. A fallback pool is incredibly helpful in a tight spot, but not having it viewable in the dashboard led to confusion about which pool was set as the fallback, or whether a fallback pool was set at all. We want to be sure you have the tools to support your day-to-day work, while also ensuring our dashboard is usable and intuitive.

You can now check which pool is set as the fallback in any given Load Balancer, along with being able to easily designate any pool in the Load Balancer as the fallback. If no fallback pool is set, then the last pool in the list will automatically be chosen. We made the decision to auto-set a pool to be sure that customers are always covered in case the worst scenario happens. You can access the fallback pool within the Traffic App of the Cloudflare dashboard when creating or editing a Load Balancer.
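The behavior described above can be sketched in a few lines (a simplification for illustration, not Cloudflare’s actual load balancing code; the function and field names are hypothetical):

```javascript
// Simplified sketch of fallback selection; names are illustrative.
// Healthy pools are routed to normally; when none are healthy, traffic
// goes to the designated fallback pool, or to the last pool in the list
// when no fallback has been set.
function choosePool(pools, fallbackId) {
  const healthy = pools.filter(p => p.healthy);
  if (healthy.length > 0) return healthy[0]; // normal routing (list order stands in for steering)
  const fallback = pools.find(p => p.id === fallbackId);
  return fallback || pools[pools.length - 1]; // pool of last resort
}
```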

Adding the Fallback Pool to the Load Balancing UI and other significant UI enhancements

Load Balancing UI Improvements

Not only did we add the fallback pool to the UI, but we saw this as an opportunity to update other areas of the Load Balancing app that have caused some confusion in the past.

Facelift and De-modaling

As a start, we gave the main Load Balancing page a facelift and de-modaled (moved content out of a smaller modal screen into a larger area) the majority of the Load Balancing UI. Moving this content out of a small web element lets users more easily understand the content on the page and lets us make better use of the available space rather than being limited to the small area of a modal. This change applies when you create or edit a Load Balancer and manage monitors and/or pools.


Adding the Fallback Pool to the Load Balancing UI and other significant UI enhancements


Adding the Fallback Pool to the Load Balancing UI and other significant UI enhancements

The updated UI combines the health status and icon to declutter the available space and make the status of a particular Load Balancer or Pool clear at a glance. We also switched to a smaller toggle button across the Load Balancing UI, freeing margin space for the action buttons. Now that we are using the page surface area more efficiently, we added more information to our tables so users are more aware of the shared aspects of their Load Balancers.

Shared Objects and Editing

Shared objects have caused some level of concern for companies who have teams across the world - all leveraging the Cloudflare dashboard.

Some of the shared objects, Monitors and Pools, have a new column outlining which Pools or Load Balancers are currently using a particular Monitor or Pool. This brings more clarity around what will be affected by any change made by someone in your organization, and supports users in being more autonomous and confident when they make an update in the dashboard. If someone from team A wants to update a monitor for a production server, they can do so without worrying that monitoring for another pool might break, and without having to speak to team B first. The time saved, and the ability to make updates as your business changes, is incredibly valuable: it supports the velocity you may want to achieve while maintaining a safe environment to operate in. The days of worrying about unforeseen consequences cropping up later down the road are swiftly coming to a close.

This helps teams understand the impact of a given change and what else would be affected. But we did not feel this was enough; we want everyone to be confident in the changes they are making. On top of the additional columns, we added a number of confirmation modals to drive confidence about a particular change, each listing the other Load Balancers or Pools that would be impacted. To really drive home the message about which objects are shared, we made a final change: edits to monitors can now take place only within the Manage Monitors page. Having users navigate to the manage page itself conveys that these items are shared. For example, allowing edits to a Monitor in the same view as editing a Load Balancer can make it seem like those changes apply only to that Load Balancer, which is not always the case.

Adding the Fallback Pool to the Load Balancing UI and other significant UI enhancements

Manage Monitors before:

Adding the Fallback Pool to the Load Balancing UI and other significant UI enhancements

Manage monitors after:

Adding the Fallback Pool to the Load Balancing UI and other significant UI enhancements

Updated CTAs/Buttons

Lastly, when users expanded the Manage Load Balancer table to view more details about the Pools or Origins within a specific Load Balancer, they would click the large X icon in the top right of the expanded card to close it - which seems reasonable in that context.

Adding the Fallback Pool to the Load Balancing UI and other significant UI enhancements

But the X icon did not close the expanded card; it deleted the Load Balancer altogether. This is dangerous, and we want to prevent users from making such mistakes. With the space gained from de-modaling large areas of the UI, we replaced these icon buttons with clickable text buttons that read ‘Edit’ or ‘Delete’. The difference is clearly defined text describing the action that will take place, rather than leaving it up to a user’s interpretation of what the icon means. We felt this was much clearer and would spare users unwanted changes.

We are very excited about the updates to the Load Balancing dashboard and look forward to improving day in and day out.



Friday, 20 March


The Serverlist: Workers Secrets, Serverless Supremacy, and more! [The Cloudflare Blog]


Check out our thirteenth edition of The Serverlist below. Get the latest scoop on the serverless space, get your hands dirty with new developer tutorials, engage in conversations with other serverless developers, and find upcoming meetups and conferences to attend.

Sign up below to have The Serverlist sent directly to your mailbox.


Saturday Morning Breakfast Cereal - Longevity [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

And then end with the last letter of this sentence I'm saying right now.

Today's News:


Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV [The Cloudflare Blog]


At Cloudflare, we produce all types of video content, ranging from recordings of our Weekly All-Hands to product demos. Being able to stream video on demand has two major advantages when compared to live video:

  1. It encourages asynchronous communication within the organization
  2. It extends the lifetime value of the shared knowledge

Historically, we haven’t had a central, secure repository of all video content that could be easily accessed from the browser. Various teams chose their own platforms to share content. If I wanted to find a recording of a product demo, for example, I’d need to search Google Drive, Gmail, and Google Chat with creative keywords. Very often, I would need to reach out to individual teams to finally locate the content.

So we decided we wanted to build CloudflareTV, an internal Netflix-like application that can only be accessed by Cloudflare employees and has all of our videos neatly organized and immediately watchable from the browser.

We wanted to achieve the following when building CloudflareTV:

  • Security: make sure the videos are access controlled and not publicly accessible
  • Authentication: ensure the application can only be accessed by Cloudflare employees
  • Tagging: allow the videos to be categorized so they can be found easily
  • Originless: build the entire backend using Workers and Stream so we don’t need separate infrastructure for encoding, storage and delivery

Securing the videos using signed URLs

Every video uploaded to Cloudflare Stream can be locked down by requiring signed URLs. A Stream video can be marked as requiring signed URLs using the UI or by making an API call:

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

Once locked down in this way, videos can’t be accessed directly. Instead, they can only be accessed using a temporary token.

In order to create signed tokens, we must first make an API call to create a key:

curl -X POST -H "X-Auth-Email: {$EMAIL}" -H "X-Auth-Key: {$AUTH_KEY}"  "{$ACCOUNT_ID}/media/keys"

The API call will return a JSON object similar to this:

{
  "result": {
    "id": "...",
    "pem": "...",
    "jwk": "...",
    "created": "2020-03-10T18:17:00.075188052Z"
  },
  "success": true,
  "errors": [],
  "messages": []
}

We can use the id and pem values in a Workers script that takes a video ID and returns a signed token that expires after 1 hour:

async function generateToken(video_id) {
    // Token expires one hour from now
    var exp_time = Math.round((new Date()).getTime() / 1000) + 3600;

    const key_data = {
        'id': '{$KEY_ID}',
        'pem': '{$PEM}',
        'exp': exp_time
    };

    let response = await fetch('' + video_id, {
        method: 'POST',
        body: JSON.stringify(key_data)
    });
    let token_value = await response.text();
    return token_value;
}
The returned signed token should look something like this:


Stream provides an embed code for each video. The “src” attribute of the embed code typically contains the video ID. But if the video is private, instead of setting the “src” attribute to the video ID, you set it to the signed token value:

<stream src="eyJhbGciOiJSUzI1NiIsImtpZCI6IjExZDM5ZjEwY2M0NGY1NGE4ZDJlMjM5OGY3YWVlOGYzIn0.eyJzdWIiOiJiODdjOWYzOTkwYjE4ODI0ZTYzMTZlMThkOWYwY2I1ZiIsImtpZCI6IjExZDM5ZjEwY2M0NGY1NGE4ZDJlMjM5OGY3YWVlOGYzIiwiZXhwIjoiMTUzNzQ2MDM2NSIsIm5iZiI6IjE1Mzc0NTMxNjUifQ.C1BEveKi4XVeZk781K8eCGsMJrhbvj4RUB-FjybSm2xiQntFi7AqJHmj_ws591JguzOqM1q-Bz5e2dIEpllFf6JKK4DMK8S8B11Vf-bRmaIqXQ-QcpizJfewNxaBx9JdWRt8bR00DG_AaYPrMPWi9eH3w8Oim6AhfBiIAudU6qeyUXRKiolyXDle0jaP9bjsKQpqJ10K5oPWbCJ4Nf2QHBzl7Aasu6GK72hBsvPjdwTxdD5neazdxViMwqGKw6M8x_L2j2bj93X0xjiFTyHeVwyTJyj6jyPwdcOT5Bpuj6raS5Zq35qgvffXWAy_bfrWqXNHiQdSMOCNa8MsV8hljQsh" controls></stream>
<script data-cfasync="false" defer type="text/javascript" src=""></script>

Tagging videos

We would like to categorize videos uploaded to Stream by tagging them. This can be done by updating the video object’s meta field, which accepts arbitrary JSON data. To categorize a video, we simply set a comma-delimited list of tags in the meta field:

curl -X POST  -d '{"uid": "VIDEO_ID", "meta": {"tags": "All Hands,Stream"}}' "{$ACCOUNT_ID}/stream/{$VIDEO_ID}"  -H "X-Auth-Email: {$EMAIL}"  -H "X-Auth-Key: {$ACCOUNT_KEY}"  -H "Content-Type: application/json"

Later, we will create a getVideos Worker function to fetch a list of videos and all associated data so we can render the UI. The tagging data we just set for this video will be included in the video data returned by the Worker.
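Since meta.tags is a comma-delimited string, the UI can filter the returned video list client-side; here is a hypothetical sketch (the function name is ours, but the meta shape matches the API call above):

```javascript
// Hypothetical client-side filter over the list returned by getVideos().
// Each video's meta.tags is a comma-delimited string, e.g. "All Hands,Stream".
function filterByTag(videos, tag) {
  return videos.filter(v => {
    const tags = ((v.meta && v.meta.tags) || '')
      .split(',')
      .map(t => t.trim()); // tolerate spaces around commas
    return tags.includes(tag);
  });
}
```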

Fetching Video Data using Workers

The heart of the UI is a list of videos. How do we get this list of videos programmatically? Stream provides an endpoint that returns all the videos and any metadata associated with them.

First, we set up environment variables for our Worker:

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

Next, we wrote a simple Workers function to call the Stream API and return a list of videos, eliminating the need for an origin:

async function getVideos() {
    const headers = {
        'X-Auth-Key': CF_KEY,
        'X-Auth-Email': CF_EMAIL
    };

    let response = await fetch('' + CF_ACCOUNT_ID + '/stream', {
        headers: headers
    });
    let video_list = await response.text();
    return video_list;
}

Lastly, we set up a zone and, within it, a Worker route pointing to our Workers script. This can be done from the Workers tab:

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

Authenticating using Cloudflare Access

Finally, we want to restrict access to CloudflareTV to people within the organization. We can do this using Cloudflare Access, available under the Access tab.

To restrict access to CloudflareTV, we must do two things:

  1. Add a new login method
  2. Add an access policy

To add a new login method, click the “+” icon and choose your identity provider. In our case, we chose Google:

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

You will see a pop up asking for information including Client ID and Client Secret, both key pieces of information required to set up Google as the identity provider.

Once we add an identity provider, we want to tell Access “who specifically should be allowed to access our application?” This is done by creating an Access Policy.


We set up an Access Policy to only allow emails ending in our domain name. This effectively makes CloudflareTV only accessible by our team!

What’s next?

If you have interesting ideas around video, Cloudflare Stream lets you focus on your idea while it handles storage, encoding, and the viewing experience for your users. Couple that with Access and Workers, and you can build powerful applications. Here are the docs to help you get started:


Control the firewall at the command line [Fedora Magazine]

A network firewall is more or less what it sounds like: a protective barrier that prevents unwanted network transmissions. They are most frequently used to prevent outsiders from contacting or using network services on a system. For instance, if you’re running a laptop at school or in a coffee shop, you probably don’t want strangers poking around on it.

Every Fedora system has a firewall built in. It’s part of the network functions in the Linux kernel inside. This article shows you how to change its settings using firewall-cmd.

Network basics

This article can’t teach you everything about computer networks. But a few basics suffice to get you started.

Any computer on a network has an IP address. Think of this just like a mailing address that allows correct routing of data. Each computer also has a set of ports, numbered 0-65535. These are not physical ports; instead, you can think of them as a set of connection points at the address.

In many cases, the port is a standard number or range depending on the application expected to answer. For instance, a web server typically reserves port 80 for non-secure HTTP communications, and/or 443 for secure HTTPS. The port numbers under 1024 are reserved for system and well-known purposes, ports 1024-49151 are registered, and ports 49152 and above are usually ephemeral (used only for a short time).
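As a quick illustration of those ranges, here is a small shell helper (a hypothetical classify_port function, not any standard utility) that buckets a port number:

```shell
# Hypothetical helper: classify a port number using the ranges described above
classify_port() {
    p=$1
    if [ "$p" -lt 1024 ]; then
        echo "well-known"
    elif [ "$p" -le 49151 ]; then
        echo "registered"
    else
        echo "ephemeral"
    fi
}

classify_port 443     # well-known (HTTPS)
classify_port 8080    # registered
classify_port 51000   # ephemeral
```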

Each of the two most common protocols for Internet data transfer, TCP and UDP, has this set of ports. TCP is used when it’s important that all data be received and, if it arrives out of order, reassembled in the right order. UDP is used for more time-sensitive services that can withstand losing some data.

An application running on the system, such as a web server, reserves one or more ports (as seen above, 80 and 443 for example). Then during network communication, a host establishes a connection between a source address and port, and the destination address and port.

A network firewall can block or permit transmissions of network data based on rules like address, port, or other criteria. The firewall-cmd utility lets you interact with the rule set to view or change how the firewall works.

Firewall zones

To verify the firewall is running, use this command with sudo. (In fairness, you can run firewall-cmd without the sudo command in environments where PolicyKit is running.)

$ sudo firewall-cmd --state

The firewalld service supports any number of zones. Each zone can have its own settings and rules for protection. In addition, each network interface can be placed in any zone individually. The default zone for an external facing interface (like the wifi or wired network card) on a Fedora Workstation is the FedoraWorkstation zone.

To see what zones are active, use the --get-active-zones flag. On this system, there are two network interfaces: a wifi card wlp2s0 and a virtualization (libvirt) bridge interface virbr0:

$ sudo firewall-cmd --get-active-zones
FedoraWorkstation
  interfaces: wlp2s0
libvirt
  interfaces: virbr0

To see the default zone, or all the defined zones:

$ sudo firewall-cmd --get-default-zone
$ sudo firewall-cmd --get-zones
FedoraServer FedoraWorkstation block dmz drop external home internal libvirt public trusted work

To see the services the firewall is allowing other systems to access in the default zone, use the --list-services flag. Here is an example from a customized system; you may see something different.

$ sudo firewall-cmd --list-services
dhcpv6-client mdns samba-client ssh

This system has four services exposed. Each of these has a well-known port number. The firewall recognizes them by name. For instance, the ssh service is associated with port 22.
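These name-to-port mappings live in the system services database (/etc/services on most Linux systems); as a quick check, you can look one up with getent:

```shell
# Look up the well-known port for the ssh service in the system services database;
# prints the service name with its port/protocol (e.g. ssh on 22/tcp)
getent services ssh
```

The firewall uses these same well-known names, so this is a handy way to confirm which port a service name refers to.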

To see other port settings for the firewall in the current zone, use the --list-ports flag. By the way, you can always declare the zone you want to check:

$ sudo firewall-cmd --list-ports --zone=FedoraWorkstation
1025-65535/udp 1025-65535/tcp

This shows that ports 1025 and above (both UDP and TCP) are open by default.

Changing zones, ports, and services

The above setting is a design decision.* It ensures novice users can use network facing applications they install. If you know what you’re doing and want a more protective default, you can move the interface to the FedoraServer zone, which prohibits any ports not explicitly allowed. (Warning: if you’re using the host via the network, you may break your connection — meaning you’ll have to go to that box physically to make further changes!)

$ sudo firewall-cmd --change-interface=<ifname> --zone=FedoraServer

* This article is not the place to discuss that decision, which went through many rounds of review and debate in the Fedora community. You are welcome to change settings as needed.

If you want to open a well-known port that belongs to a service, you can add that service to the default zone (or use --zone to adjust a different zone). You can add more than one at once. This example opens up the well-known ports for your web server for both HTTP and HTTPS traffic, on ports 80 and 443:

$ sudo firewall-cmd --add-service=http --add-service=https

Not all services are defined, but many are. To see the whole list, use the --get-services flag.

If you want to add specific ports, you can do that by number and protocol as well. (You can also combine --add-service and --add-port flags, as many as necessary.) This example opens up the UDP service for a network boot service:

$ sudo firewall-cmd --add-port=67/udp

Important: If you want your changes to be effective after you reboot your system or restart the firewalld service, you must add the --permanent flag to your commands. The examples here only change the firewall until one of those events next happens.

These are just some of the many functions of the firewall-cmd utility and the firewalld service. There is much more information on firewalld at the project’s home page that’s worth reading and trying out.

Photo by Jakob Braun on Unsplash.

Thursday, 19 March


Using Cloudflare Gateway to Stay Productive (and turn off distractions) While Working Remotely [The Cloudflare Blog]


This week, like many of you reading this article, I am working from home. I don’t know about you, but I’ve found it hard to stay focused when the Internet is full of news related to the coronavirus.

CNN. Twitter. Fox News. It doesn’t matter where you look, everyone is vying for your attention. It’s totally riveting…

… and it’s really hard not to get distracted.

It got me annoyed enough that I decided to do something about it. Using Cloudflare’s new product, Cloudflare Gateway, I removed all the online distractions I normally get snared by — at least during working hours.

This blog post isn’t very long, but that’s a function of how easy it is to get Gateway up and running!

Getting Started

To get started, you’ll want to set up Gateway under your Cloudflare account. Head to the Cloudflare for Teams dashboard to set it up for free (if you don’t already have a Cloudflare account, hit the ‘Sign up’ button beneath the login form).

If you are using Gateway for the first time, the dashboard will take you through an onboarding experience:


The onboarding flow will help you set up your first location. A location is usually a physical entity like your home, office, store or a data center.

When you are setting up your location, the dashboard will automatically identify your IP address and create a location using that IP. Gateway associates requests from your router or device with your location by matching their source IP against the location’s linked IP address (for an IPv4 network). If you are curious, you can read more about how Gateway determines your location here.

Before you complete the setup you will have to change your router’s DNS settings by removing the existing DNS resolvers and adding Cloudflare Gateway’s recursive DNS resolvers:


How you configure your DNS settings may vary by router or device, so we created a page to show you how to change DNS settings for different devices.

You can also watch this video to learn how to setup Gateway:

Deep Work

Next up, in the dashboard, I am going to go to my policies and create a policy that will block my access to distracting sites. You can call your policy anything you want, but I am going to call mine “Deep work.”


And I will add a few websites that I don’t want to get distracted by, like CNN, Fox News and Twitter.


After I add the domains, I hit Save.

If you find the prospect of blocking all of these websites cumbersome, you can use category-based DNS filtering to block all domains that are associated with a category (‘Content categories’ have limited capabilities on Gateway’s free tier).


So if I select Sports, all websites that are related to Sports will now be blocked by Gateway. This will take most people a few minutes to complete.

And once you set the rules by hitting ‘Save’, it will take just seconds for the selected policies to propagate across all of Cloudflare’s data centers, spread across more than 200 cities around the world.

How can I test if Gateway is blocking the websites?

If you now try to go to one of the blocked websites, you will see the following page on your browser:


Cloudflare Gateway is letting your browser know that the website you blocked is unreachable. You can also test if Gateway is working by using dig or nslookup on your machine:


If a domain is blocked, you will see the following in the DNS response status: REFUSED.

This means that the policy you created is working!
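If you want to script that check, here is a minimal sketch; the dig header line is a hypothetical capture, so the parsing is visible without any network access:

```shell
# Sample header line as printed by dig for a blocked domain (hypothetical capture)
response=';; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 48512'

# Extract the status field; a real script would capture this line from
# dig run against your Gateway-configured resolver
status=$(printf '%s\n' "$response" | sed -n 's/.*status: \([A-Z]*\),.*/\1/p')
echo "$status"    # REFUSED
```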

And once working hours are over, it’s back to being glued to the latest news.

If you’d rather watch this in video format, here’s one I recorded earlier:

And to everyone dealing with the challenges of COVID-19 and working from home — stay safe!


Saturday Morning Breakfast Cereal - Literally [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

You can also do this by making up titles of novels or songs for people who claim to hate everything by a particular artist.

Today's News:

Wednesday, 18 March


Saturday Morning Breakfast Cereal - Groups [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

Once again, in case you missed it, you can get pretty much every book of mine to which I own the rights, for free, here:

Today's News:

Tuesday, 17 March


Announcing the release of Fedora 32 Beta [Fedora Magazine]

The Fedora Project is pleased to announce the immediate availability of Fedora 32 Beta, the next step towards our planned Fedora 32 release at the end of April.

Download the prerelease from our Get Fedora site:

Or, check out one of our popular variants, including KDE Plasma, Xfce, and other desktop environments, as well as images for ARM devices like the Raspberry Pi 2 and 3:

Beta Release Highlights

Fedora Workstation

New in Fedora 32 Workstation Beta is EarlyOOM enabled by default. EarlyOOM enables users to more quickly recover and regain control over their system in low-memory situations with heavy swap usage. Fedora 32 Workstation Beta also enables the fs.trim timer by default, which improves performance and wear leveling for solid state drives.

Fedora 32 Workstation Beta includes GNOME 3.36, the newest release of the GNOME desktop environment. It is full of performance enhancements and improvements. GNOME 3.36 adds a Do Not Disturb button in the notifications, improved setup for parental controls and virtualization, and tweaks to Settings. For a full list of GNOME 3.36 highlights, see the release notes.

Other updates

Fedora 32 Beta includes updated versions of many popular packages like Ruby, Python, and Perl. It also includes version 10 of the popular GNU Compiler Collection (GCC). We also have the customary updates to underlying infrastructure software, like the GNU C Library. For a full list, see the Change set on the Fedora Wiki.

Testing needed

Since this is a Beta release, we expect that you may encounter bugs or missing features. To report issues encountered during testing, contact the Fedora QA team via the mailing list or in the #fedora-qa channel on IRC Freenode. As testing progresses, common issues are tracked on the Common F32 Bugs page.

For tips on reporting a bug effectively, read how to file a bug.

What is the Beta Release?

A Beta release is code-complete and bears a very strong resemblance to the final release. If you take the time to download and try out the Beta, you can check and make sure the things that are important to you are working. Every bug you find and report doesn’t just help you, it improves the experience of millions of Fedora users worldwide! Together, we can make Fedora rock-solid. We have a culture of coordinating new features and pushing fixes upstream as much as we can. Your feedback improves not only Fedora, but Linux and free software as a whole.

More information

For more detailed information about what’s new on Fedora 32 Beta release, you can consult the Fedora 32 Change set. It contains more technical information about the new packages and improvements shipped with this release.

Photo by Josh Calabrese on Unsplash.


Saturday Morning Breakfast Cereal - Hold Music [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

Hey everyone stuck at home - for the duration of Covid Party 2020, I'm making a bunch of ebooks free. Just click the link in the below-comic blog.

Today's News:


Monday, 16 March


Fedora community and the COVID-19 crisis [Fedora Magazine]

[This message comes directly from the desk of Matthew Miller, the Fedora Project Leader.  — Ed.] 

Congratulations to the Fedora community for the upcoming on-time release of Fedora 32 Beta. While we’ve gotten better at hitting our schedule over the years, it’s always nice to celebrate a little bit each time we do. But that may not be what’s on your mind this week. Like you, I’ve been thinking a lot about the global COVID-19 pandemic. During the Beta period, many of us were unaffected by this outbreak, but as the effects intensify around the world, the month between now and the final release will be different.

“Friends” is the first of our Four Foundations for a reason: Fedora is a community. The most important Fedora concerns right now are your health and safety. Many of you are asked to work from home, to practice social distancing, or even to remain under quarantine. For some of you, this will mean more time to contribute to your favorite open source projects. For others, you have additional stress as partners, kids, and others in your life require additional care. For all of us, the uncertainty weighs on our minds.

I want to make one thing very clear: do not feel bad if you cannot contribute to the level you want to. We always appreciate what you do for the Fedora community, but your health — both physical and mental — is more important than shipping a release. As of right now, we’re planning to continue on schedule, but we understand that the situation is changing rapidly. We’re working on contingency plans and the option of delaying the Fedora 32 release remains on the table.

As you may already know, the Fedora Council has decided to refrain from sponsoring events through the end of May. We will continue to re-evaluate this as the global situation changes. Please follow the directions of your local public health authorities and keep yourself safe.


Saturday Morning Breakfast Cereal - Cat [Saturday Morning Breakfast Cereal]

Click here to go see the bonus panel!

Human is good. Human is good. That is why human is.

Today's News:


Connect your Google Drive to Fedora Workstation [Fedora Magazine]

There are plenty of cloud services available where you can store important documents. Google Drive is undoubtedly one of the most popular. It offers a matching set of applications like Docs, Sheets, and Slides to create content. But you can also store arbitrary content in your Google Drive. This article shows you how to connect it to your Fedora Workstation.

Adding an account

Fedora Workstation lets you add an account either after installation during first startup, or at any time afterward. To add your account during first startup, follow the prompts. Among them is a choice of accounts you can add:

Online account listing

Select Google and a login prompt appears for you to login, so use your Google account information.

Online account login dialog

Be aware this information is only transmitted to Google, not to the GNOME project. The next screen asks you to grant access, which is required so your system’s desktop can interact with Google. Scroll down to review the access requests, and choose Allow to proceed.

You can expect to receive notifications on mobile devices and Gmail that a new device — your system — accessed your Google account. This is normal and expected.

Online account access request dialog

If you didn’t do this at first startup, or you need to re-add your account, open the Settings tool, and select Online Accounts to add the account. The Settings tool is available through the dropdown at right side of the Top Bar (the “gear” icon), or by opening the Overview and typing settings. Then proceed as described above.

Using the Files app with Google Drive

Open the Files app (formerly known as nautilus). Locations the Files app can access appear on the left side. Locate your Google account in the list.

When you select this account, the Files app shows the contents of your Google drive. Some files can be opened using your Fedora Workstation local apps, such as sound files or LibreOffice-compatible files (including Microsoft Office docs). Other files, such as Google app files like Docs, Sheets, and Slides, open using your web browser and the corresponding app.

Remember that if the file is large, it will take some time to receive over the network so you can open it.

You can also copy and paste files between your Google Drive storage and other storage connected to your Fedora Workstation, and use the built-in functions to rename files, create folders, and organize them. For sharing and other advanced options, use Drive from your browser per normal.

Be aware that the Files app does not refresh contents in real time. If you add or remove files from other Google connected devices like your mobile phone or tablet, you may need to hit Ctrl+R to refresh the Files app view.

Photo by Beatriz Pérez Moya on Unsplash.

Sunday, 15 March

Friday, 13 March


Submit a supplemental wallpaper for Fedora 32 [Fedora Magazine]

Attention Fedora community members: Fedora is seeking submissions for supplemental wallpapers to be included with the Fedora 32 release. Whether you’re an active contributor, or have been looking for an easy way to get started contributing, submitting a wallpaper is a great way to help. Read on for more details.

Each release, the Fedora Design Team works with the community on a set of 16 additional wallpapers. Users can install and use these to supplement the standard wallpaper.

Dates and deadlines

The submission phase opened on March 7, 2020 and ends March 21, 2020 at 23:59 UTC.

Important note: In some circumstances, submissions during the final hours may not get into the election if there is insufficient time to do legal research. Please help by following the guidelines correctly, and submit only work under a correct license.

The voting phase will open the Monday following the close of submissions, March 23, 2020, and will be open until the end of the month on March 31, 2020 at 23:59 UTC.

How to contribute a wallpaper

Fedora uses the Nuancier application to manage the submissions and the voting process. To submit, you need a Fedora account. If you don’t have one, create one here in the Fedora Account System (FAS). To vote you must have a signed contributor agreement (also accessible in FAS) which only takes a few moments.

You can access Nuancier here along with detailed instructions for submissions.

Thursday, 12 March


Fedora shirts and sweatshirts from HELLOTUX [Fedora Magazine]

Linux clothes specialist HELLOTUX from Europe recently signed an agreement with Red Hat to make embroidered Fedora t-shirts, polo shirts and sweatshirts. They have been making Debian, Ubuntu, openSUSE, and other Linux shirts for more than a decade and now the collection is extended to Fedora.

UPDATE: There’s a special coupon code below that provides some savings on an order. Read onward for more details!

Embroidered Fedora polo shirt.

Instead of printing, they use programmable embroidery machines to make the Fedora embroidery. All of the design work is made exclusively with Linux; this is a matter of principle.

Some photos of the embroidering process for a Fedora sweatshirt:

You can get Fedora polos and t-shirts in blue or black and the sweatshirt in gray here.

Oh, “just one more thing,” as Columbo used to say: you can get a $5/€5 discount on your order when you use the coupon code FEDORA5. Order directly from the HelloTux website.

[Editor’s note: Updated on 4 March 2020 at 1745 UTC to add the coupon code.]

Wednesday, 11 March


Using the Quarkus Framework on Fedora Silverblue – Just a Quick Look [Fedora Magazine]

Quarkus is a framework for Java development that is described on their web site as:

A Kubernetes Native Java stack tailored for OpenJDK HotSpot and GraalVM, crafted from the best of breed Java libraries and standards – Feb. 5, 2020

Silverblue — a Fedora Workstation variant with a container based workflow central to its functionality — should be an ideal host system for the Quarkus framework.

There are currently two ways to use Quarkus with Silverblue. It can be run in a pet container such as Toolbox/Coretoolbox. Or it can be run directly in a terminal emulator. This article will focus on the latter method.

Why Quarkus

According to “Quarkus has been designed around a containers first philosophy. What this means in real terms is that Quarkus is optimized for low memory usage and fast startup times.” To achieve this, they employ first class support for Graal/Substrate VM, build time Metadata processing, reduction in reflection usage, and native image preboot. For details about why this matters, read Container First at Quarkus.


A few prerequisites will need to be configured before you can start using Quarkus. First, you need an IDE of your choice. Any of the popular ones will do; Vim or Emacs will work as well. The Quarkus site provides full details on how to set up the three major Java IDEs (Eclipse, IntelliJ IDEA, and Apache NetBeans). You will need a version of JDK installed: JDK 8, JDK 11, or any distribution of OpenJDK is fine. GraalVM 19.2.1 or 19.3.1 is needed for compiling down to native. You will also need Apache Maven 3.5.3+ or Gradle. This article will use Maven because that is what the author is more familiar with. Use the following command to layer Java 11 OpenJDK and Maven onto Silverblue:

$ rpm-ostree install java-11-openjdk* maven

Alternatively, you can download your favorite version of Java and install it directly in your home directory.

After rebooting, configure your JAVA_HOME and PATH environment variables to reference the new applications. Next, go to the GraalVM download page, and get GraalVM version 19.2.1 or version 19.3.1 for Java 11 OpenJDK. Install Graal as per the instructions provided. Basically, copy and decompress the archive into a directory under your home directory, then modify the PATH environment variable to include Graal. You use it as you would any JDK, so you can set it up as a platform in the IDE of your choice. Now is the time to set up the native image if you are going to use one. For more details on setting up your system to use Quarkus and the Quarkus native image, check out their Getting Started tutorial. With these parts installed and the environment set up, you can now try out Quarkus.
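For example, your shell profile might end up with lines like these (the JDK and GraalVM paths are assumptions; adjust them to wherever the archives actually landed on your system):

```shell
# Assumed install location for the layered OpenJDK 11; verify with: ls /usr/lib/jvm
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk
export PATH="$JAVA_HOME/bin:$PATH"

# GraalVM decompressed under the home directory (example path)
export GRAALVM_HOME="$HOME/graalvm-ce-java11-19.3.1"
export PATH="$GRAALVM_HOME/bin:$PATH"
```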


Quarkus recommends you create a project using the bootstrapping method. Below are some example commands entered into a terminal emulator in the Gnome shell on Silverblue.

$ mvn io.quarkus:quarkus-maven-plugin:1.2.1.Final:create \
    -DprojectGroupId=org.jakfrost \
    -DprojectArtifactId=silverblue-logo \
    -DclassName="org.jakfrost.quickstart.GreetingResource" \
    -Dpath="/hello"
$ cd silverblue-logo

The bootstrapping process shown above will create a project under the current directory with the name silverblue-logo. After this completes, start the application in development mode:

$ ./mvnw compile quarkus:dev

With the application running, check whether it responds as expected by issuing the following command:

$ curl -w '\n' http://localhost:8080/hello

The above command should print hello on the next line. Alternatively, test the application by browsing to http://localhost:8080/hello with your web browser. You should see the same lonely hello on an otherwise empty page. Leave the application running for the next section.


Open the project in your favorite IDE. If you are using Netbeans, simply open the project directory where the pom.xml file resides. Now would be a good time to have a look at the pom.xml file.

Quarkus uses ArC for its dependency injection. ArC is a dependency of quarkus-resteasy, so it is already part of the core Quarkus installation. Add a companion bean to the project by creating a Java class in your IDE called GreetingService.java. Then put the following code into it:

import javax.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class GreetingService {

    public String greeting(String name) {
        return "hello " + name;
    }
}

The above code is a verbatim copy of what is used in the injection example in the Quarkus Getting Started tutorial. Modify GreetingResource.java by adding the following lines of code:

import javax.inject.Inject;
import org.jboss.resteasy.annotations.jaxrs.PathParam;

    @Inject
    GreetingService service; // inject the service

    @GET // add a getter to use the injected service
    @Path("/greeting/{name}")
    public String greeting(@PathParam String name) {
        return service.greeting(name);
    }

If you haven’t stopped the application, it will be easy to see the effect of your changes. Just enter the following curl command:

$ curl -w '\n' http://localhost:8080/hello/greeting/Silverblue

The above command should print hello Silverblue on the following line. The URL should work similarly in a web browser. There are two important things to note:

  1. The application was running and Quarkus detected the file changes on the fly.
  2. The injection of code into the app was very easy to perform.

The native image

Next, package your application as a native image that will work in a podman container. Exit the application by pressing CTRL-C. Then use the following command to package it:

$ ./mvnw package -Pnative -Dquarkus.native.container-runtime=podman

Now, build the container:

$ podman build -f src/main/docker/Dockerfile.native -t silverblue-logo/silverblue-logo

Now run it with the following:

$ podman run -i --rm -p 8080:8080 localhost/silverblue-logo/silverblue-logo

To get the container build to successfully complete, it was necessary to copy the /target directory and contents into the src/main/docker/ directory. Investigation as to the reason why is still required, and though the solution used was quick and easy, it is not an acceptable way to solve the problem.

Now that you have the container running with the application inside, you can use the same methods as before to verify that it is working.

Point your browser to the URL http://localhost:8080/ and you should get an index.html that is automatically generated by Quarkus every time you create or modify an application. It resides in the src/main/resources/META-INF/resources/ directory. Drop other HTML files in this resources directory to have Quarkus serve them on request.

For example, create a file named logo.html in the resources directory containing the below markup:

<!DOCTYPE html>
<!-- To change this license header, choose License Headers in Project Properties.
To change this template file, choose Tools | Templates
and open the template in the editor. -->
<html>
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
    </head>
    <body>
        <img src="fedora-silverblue-logo.png" alt="Fedora Silverblue"/>
    </body>
</html>

Next, save the below image alongside the logo.html file with the name fedora-silverblue-logo.png:

Now view the results at http://localhost:8080/logo.html.

Testing your application

Quarkus supports JUnit 5 tests. Look at your project’s pom.xml file. In it you should see two test dependencies. The generated project will contain a simple test, named GreetingResourceTest.java. Testing the native image is only supported in prod mode. However, you can test the jar file in dev mode. The generated tests use RestAssured, but you can use whatever test library you wish with Quarkus. Use Maven to run the tests:

$ ./mvnw test

More details can be found in the Quarkus Getting Started tutorial.

Further reading and tutorials

Quarkus has an extensive collection of tutorials and guides. They are well worth the time to delve into the breadth of this microservices framework.

Quarkus also maintains a publications page that lists some very interesting articles on actual use cases of Quarkus. This article has only just scratched the surface of the topic. If what was presented here has piqued your interest, then follow the above links for more information.

Tuesday, 10 March


Monday, 09 March


Fish – A Friendly Interactive Shell [Fedora Magazine]

Are you looking for an alternative to bash? Are you looking for something more user-friendly? Then look no further because you just found the golden fish!

Fish (friendly interactive shell) is a smart and user-friendly command line shell that works on Linux, MacOS, and other operating systems. Use it for everyday work in your terminal and for scripting. Scripts written in fish are less cryptic than their equivalent bash versions.

Fish’s user-friendly features

  • Suggestions
    Fish will suggest commands that you have written before. This boosts productivity when typing the same commands often.
  • Sane scripting
    Fish avoids using cryptic characters. This provides a clearer and friendlier syntax.
  • Completion based on man pages
    Fish will autocomplete parameters based on the command’s man page.
  • Syntax highlighting
    Fish will highlight command syntax to make it visually friendly.


Fedora Workstation

Use the dnf command to install fish:

$ sudo dnf install fish

Make fish your default shell by installing the util-linux-user package and then running the chsh (change shell) command with the appropriate parameters:

$ sudo dnf install util-linux-user
$ chsh -s /usr/bin/fish

You will need to log out and back in for this change to take effect.
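To confirm which login shell is now recorded for your account, you can read the seventh field of your passwd entry:

```shell
# Print the login shell recorded for the current user in the account database;
# after the chsh above takes effect, this would show /usr/bin/fish
getent passwd "$(id -un)" | cut -d: -f7
```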

Fedora Silverblue

Because fish is not a GUI application, you will need to layer it using rpm-ostree. Use the following command to install fish on Fedora Silverblue:

$ rpm-ostree install fish

On Fedora Silverblue you will need to reboot your PC to switch to the new ostree image.

If you want to make fish your main shell on Fedora Silverblue, the easiest way is to update the /etc/passwd file. Find your user and change /bin/bash to /usr/bin/fish.

You will need root privileges to edit the /etc/passwd file. Also you will need to log out and back in for this change to take effect.


The per-user configuration file for fish is ~/.config/fish/ To make configuration changes for all users, edit /etc/fish/ instead.

The per-user configuration file must be created manually. The installation scripts will not create ~/.config/fish/

Here are a couple of configuration examples, shown alongside their bash equivalents, to get you started:

Creating aliases

  • ~/.bashrc:
    alias ll='ls -lh'
  • ~/.config/fish/
    alias ll='ls -lh'

Setting environment variables

  • ~/.bashrc:
    export PATH=$PATH:~/bin
  • ~/.config/fish/
    set -gx PATH $PATH ~/bin

Working with fish

When fish is configured as your default shell, the command prompt will look similar to what is shown in the below image. If you haven’t configured fish to be your default shell, just run the fish command to start it in your current terminal session.

As you start typing commands, you will notice the syntax highlighting:

Cool, isn’t it? :-)

You will also see commands being suggested as you type. For example, start typing the previous command a second time:

Notice the gray text that appears as you type. The gray text is fish suggesting the command you wrote before. To autocomplete it, just press CTRL+F.

Get argument suggestions based on the preceding command’s man page by typing a dash (-) and then the TAB key:

If you press TAB once, it will show you the first few suggestions (or every suggestion, if there are only a few arguments available). If you press TAB a second time, it will show you all suggestions. If you press TAB three times consecutively, it will switch to interactive mode and you can select an argument using the arrow keys.

Otherwise, fish works similarly to most other shells. The remaining differences are well documented, so it shouldn’t be difficult to find other features that you may be interested in.

Make fish even more powerful

Make fish even more powerful with powerline. Powerline adds command execution time, colored git status, the current git branch, and much more to fish’s interface.

Before installing powerline for fish, you must install Oh My Fish. Oh My Fish extends fish’s core infrastructure to enable the installation of additional plugins. The easiest way to install Oh My Fish is to use the curl command:

> curl -L | fish

If you don’t want to pipe the installation commands directly to curl, see the installation section of Oh My Fish’s README for alternative installation methods.

Fish’s powerline plugin is bobthefish. Bobthefish requires the powerline-fonts package.

On Fedora Workstation:

> sudo dnf install powerline-fonts

On Fedora Silverblue:

> rpm-ostree install powerline-fonts

On Fedora Silverblue you will have to reboot to complete the installation of the fonts.

After you have installed the powerline-fonts package, install bobthefish:

> omf install bobthefish

Now you can experience the full awesomeness of fish with powerline:

Additional resources

Check out these web pages to learn even more about fish:

Sunday, 08 March

Thursday, 05 March


An Ever Evolving Company Requires an Ever Evolving Communication Plan [Yelp Engineering and Product Blog]

It’s 2014 and your teams are divided by platform, something like: Web, Mobile Web, Android, and iOS. In order to launch features, product managers jump from platform to platform, and teams move fast. Really fast. Lines of code in each repository grow to the point where you now call them “monoliths.” A few engineers maintain these monoliths when they need to, but no one is solely dedicated to the task. Engineers are distributed by platform, so communicating about when to maintain the monoliths is easy, but this structure presents another problem. Can you continue...


Manage tasks and projects on Fedora with Taskwarrior [Fedora Magazine]

There are a multitude of applications for managing your todo list. One of them is Taskwarrior, which lets you manage your tasks in the terminal without a GUI. This article will show you how to get started using it.

What is Taskwarrior?

Taskwarrior is a CLI task manager and organizer. It is flexible, fast, and unobtrusive. It does its job then gets out of your way.

Taskwarrior uses $HOME/.taskrc and $HOME/.task to store your settings and tasks respectively.

Getting started with Taskwarrior

It’s easy to use Taskwarrior to manage your daily tasks. Here are some simple commands. To add tasks:

$ task add buy milk
Created task 1.
$ task add buy eggs
Created task 2.
$ task add bake cake
Created task 3.

To list your tasks, you can use the task command on its own for the simplest listing:

$ task 

ID Age Description    Urg
  1 17s buy milk       0
  2 14s buy eggs       0
  3 11s bake cake      0

3 tasks.

To mark a task as complete, use the done keyword:

$ task 1 done
 Completed task 1 'buy milk'.
 Completed 1 task.
$ task 2 done
 Completed task 2 'buy eggs'.
 Completed 1 task.
$ task
 [task next]
 ID Age Description Urg
  1 57s bake cake      0
 1 task

Diving deeper into Taskwarrior

Priority management

Taskwarrior (task) is designed to help prioritize your tasks. To do this, task has multiple implicit and explicit variables it can use to determine an “Urgency” value.

Consider the following list.

$ task
 [task next]
 ID Age  Description    Urg
  1 2min buy eggs          0
  2 2min buy flour         0
  3 2min bake cake         0
  4 2min pay rent          0
  5 3s   install fedora    0
 5 tasks

One could argue that paying your rent and installing Fedora have a higher priority than baking a cake. You can tell task about this by using the pri modifier.

$ task 4 mod pri:H
 Modifying task 4 'pay rent'.
 Modified 1 task.
$ task 5 mod pri:M
 Modifying task 5 'install fedora'.
 Modified 1 task.
$ task
 [task next]
 ID Age  P Description    Urg
  4 4min H pay rent          6
  5 2min M install fedora  3.9
  1 4min   buy eggs          0
  2 4min   buy flour         0
  3 4min   bake cake         0
 5 tasks

Rent is very important: it has a date we need to pay it by, say within 3 days of the 1st of the month. You can tell task this by using the due modifier.

$ task 4 mod due:3rd
 Modifying task 4 'pay rent'.
 Modified 1 task.
$ task
 [task next]
 ID Age   P Due Description    Urg
  4 12min H 2d  pay rent       13.7
  5 10min M     install fedora  3.9
  1 12min       buy eggs          0
  2 12min       buy flour         0
  3 12min       bake cake         0
 5 tasks
$ date
 Sat Feb 29 11:59:29 STD 2020

Because the 3rd of next month is nearby, the urgency value of rent has skyrocketed, and will continue to do so once we have reached and passed the due date.
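Conceptually, the Urg column is a weighted sum of the task’s attributes. The sketch below is illustrative only: the priority weights reproduce the H = 6 and M = 3.9 values shown in the listings above, but the due-date term is a simplified linear ramp standing in for Taskwarrior’s actual schedule, not its real algorithm.

```python
# Illustrative urgency model (NOT Taskwarrior's exact algorithm):
# urgency = priority weight + a due-date term that grows as the
# deadline approaches and saturates once the date passes.
PRIORITY_WEIGHT = {"H": 6.0, "M": 3.9, "L": 1.8}

def urgency(priority=None, days_until_due=None, due_coeff=12.0):
    score = PRIORITY_WEIGHT.get(priority, 0.0)
    if days_until_due is not None:
        # Linear ramp: a small fraction of the coefficient a week out,
        # the full coefficient at (or after) the due date.
        frac = max(0.2, min(1.0, 1.0 - (days_until_due / 7.0) * 0.8))
        score += due_coeff * frac
    return score

print(urgency("H"))  # 6.0: priority alone, as in the listing above
print(urgency("H", days_until_due=2) > urgency("H"))  # deadline raises urgency
```

The exact coefficients are configurable in Taskwarrior; the point is that every attribute you set contributes a weighted term to the final urgency score.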

However, not all tasks need to be done right away. Say for example you don’t want to worry about paying your rent until it is posted on the first of the month. You can tell taskwarrior about this using the wait modifier. (Hint: in the following example, som is short for “start of month,” one of the shortcuts taskwarrior understands.)

$ task 4 mod wait:som
 Modifying task 4 'pay rent'.
 Modified 1 task.
$ task
 [task next]
 ID Age   P Description    Urg
  5 14min M install fedora  3.9
  1 16min   buy eggs          0
  2 16min   buy flour         0
  3 16min   bake cake         0
 4 tasks

You will no longer be able to see the pay rent task until the start of the month. You can view waiting tasks by using task waiting:

$ task waiting
 ID Age   P Wait       Remaining Due        Description
  4 18min H 2020-03-01       11h 2020-03-03 pay rent
 1 task

There are a few other modifiers you can set. The schedule and until modifiers will place a “start” date on a task and remove a task after a given date, respectively.

You may have tasks that require other tasks to be completed. To add a dependency for other tasks, use the dep modifier:

$ task
 [task next]
 ID Age   P Description    Urg
  5 30min M install fedora  3.9
  1 33min   buy eggs          0
  2 33min   buy flour         0
  3 33min   bake cake         0
 4 tasks
$ task 3 mod dep:1,2
 Modifying task 3 'bake cake'.
 Modified 1 task.
 $ task
 [task next]
 ID Age   Deps P Description    Urg
  1 33min        buy eggs          8
  2 33min        buy flour         8
  5 31min      M install fedora  3.9
  3 33min 1 2    bake cake        -5
 4 tasks

This modifies the urgency of any tasks that are blocking another task. Now buying eggs and flour is more urgent because those tasks are preventing you from baking the cake.


You can add notes to a task using task <number> annotate:

$ task 3 anno No blueberries
 Annotating task 3 'bake cake'.
 Annotated 1 task.
$ task
 [task next]
 ID Age Deps P Description    Urg
  1 1h         buy eggs          8
  2 1h         buy flour         8
  5 1h       M install fedora  3.9
  3 1h  1 2    bake cake      -4.2
                 2020-02-29 No blueberries
 4 tasks

Organizing tasks

Tasks can be assigned to projects and tagged by using the project modifier and adding a tag using the + sign followed by the tag name, such as +problem.

Putting it all together

You can combine everything you learned to create a task in one line with all the required options.

$ task add Write Taskwarrior post \
pri:M due:1m wait:som until:due+2w sche:15th \
project:magazine +taskwarrior +community +linux

 Created task 6.
 The project 'magazine' has changed.  Project 'magazine' is 0% complete (1 task remaining).
$ task 6
 No command specified - assuming 'information'.
 Name          Value
 ID            6
 Description   Write Taskwarrior post
 Status        Waiting
 Project       magazine
 Entered       2020-02-29 13:50:27 (6s)
 Waiting until 2020-03-01 00:00:00
 Scheduled     2020-03-15 00:00:00
 Due           2020-03-30 14:50:27
 Until         2020-04-13 14:50:27
 Last modified 2020-02-29 13:50:27 (6s)
 Tags          taskwarrior community linux
 UUID          27768737-f6a2-4515-af9d-4f58773c76a5
 Urgency        5.3
 Priority      M

Installing Taskwarrior on Fedora

Taskwarrior is available in the default Fedora repository. To install it use this command with sudo:

$ sudo dnf install task

For rpm-ostree based distributions like Fedora Silverblue:

$ sudo rpm-ostree install task 

Tips and tricks

  • Taskwarrior has a hook system, meaning that there are many tools you can plug in, such as bugwarrior!
  • Taskwarrior can connect to a taskserver for server/client setups. (This is left as an exercise for the reader for now.)

Photo by Bogdan Kupriets on Unsplash.

Tuesday, 03 March

Monday, 02 March


Demonstrating PERL with Tic-Tac-Toe, Part 2 [Fedora Magazine]

The astute observer may have noticed that PERL is misspelled. In a March 1, 1999 interview with Linux Journal, Larry Wall explained that he originally intended to include the letter “A” from the word “And” in the title “Practical Extraction And Report Language” such that the acronym would correctly spell the word PEARL. However, before he released PERL, Larry heard that another programming language had already taken that name. To resolve the name collision, he dropped the “A”. The acronym is still valid because title case and acronyms allow articles, short prepositions and conjunctions to be omitted (compare for example the acronym LASER).

Name collisions happen when distinct commands or variables with the same name are merged into a single namespace. Because Unix commands share a common namespace, two commands cannot have the same name.

The same problem exists for the names of global variables and subroutines within programs written in languages like PERL. This is an especially significant problem when programmers try to collaborate on large software projects or otherwise incorporate code written by other programmers into their own code base.

Starting with version 5, PERL supports packages. Packages allow PERL code to be modularized with unique namespaces so that the global variables and functions of the modularized code will not collide with the variables and functions of another script or module.

Shortly after its release, PERL5 software developers all over the world began writing software modules to extend PERL’s core functionality. Because many of those developers (currently about 15,000) have made their work freely available on the Comprehensive Perl Archive Network (CPAN), you can easily extend the functionality of PERL on your PC so that you can perform very advanced and complex tasks with just a few commands.

The remainder of this article builds on the previous article in this series by demonstrating how to install, use and create PERL modules on Fedora Linux.

An example PERL program

See the example program from the previous article below, with a few lines of code added to import and use some modules named chip1, chip2 and chip3. It is written in such a way that the program should work even if the chip modules cannot be found. Future articles in this series will build on the below script by adding the additional modules named chip2 and chip3.

You should be able to copy and paste the below code into a plain text file and use the same one-liner that was provided in the previous article to strip the leading numbers.

00 #!/usr/bin/perl
02 use strict;
03 use warnings;
05 use feature 'state';
07 use constant MARKS=>[ 'X', 'O' ];
08 use constant HAL9K=>'O';
09 use constant BOARD=>'
10 ┌───┬───┬───┐
11 │ 1 │ 2 │ 3 │
12 ├───┼───┼───┤
13 │ 4 │ 5 │ 6 │
14 ├───┼───┼───┤
15 │ 7 │ 8 │ 9 │
16 └───┴───┴───┘
17 ';
19 use lib 'hal';
20 use if -e 'hal/', 'chip1';
21 use if -e 'hal/', 'chip2';
22 use if -e 'hal/', 'chip3';
24 sub get_mark {
25    my $game = shift;
26    my @nums = $game =~ /[1-9]/g;
27    my $indx = (@nums+1) % 2;
29    return MARKS->[$indx];
30 }
32 sub put_mark {
33    my $game = shift;
34    my $mark = shift;
35    my $move = shift;
37    $game =~ s/$move/$mark/;
39    return $game;
40 }
42 sub get_move {
43    return (<> =~ /^[1-9]$/) ? $& : '0';
44 }
46 PROMPT: {
47    no strict;
48    no warnings;
50    state $game = BOARD;
52    my $mark;
53    my $move;
55    print $game;
57    if (defined &get_victor) {
58       my $victor = get_victor $game, MARKS;
59       if (defined $victor) {
60          print "$victor wins!\n";
61          complain if ($victor ne HAL9K);
62          last PROMPT;
63       }
64    }
66    last PROMPT if ($game !~ /[1-9]/);
68    $mark = get_mark $game;
69    print "$mark\'s move?: ";
71    if ($mark eq HAL9K and defined &hal_move) {
72       $move = hal_move $game, $mark, MARKS;
73       print "$move\n";
74    } else {
75       $move = get_move;
76    }
77    $game = put_mark $game, $mark, $move;
79    redo PROMPT;
80 }

Once you have the above code downloaded and working, create a subdirectory named hal under the same directory that you put the above program. Then copy and paste the below code into a plain text file and use the same procedure to strip the leading numbers. Move the version without the line numbers into the hal subdirectory.

00 # basic operations chip
02 package chip1;
04 use strict;
05 use warnings;
07 use constant MAGIC=>'
08 ┌───┬───┬───┐
09 │ 2 │ 9 │ 4 │
10 ├───┼───┼───┤
11 │ 7 │ 5 │ 3 │
12 ├───┼───┼───┤
13 │ 6 │ 1 │ 8 │
14 └───┴───┴───┘
15 ';
17 use List::Util 'sum';
18 use Algorithm::Combinatorics 'combinations';
20 sub get_moves {
21    my $game = shift;
22    my $mark = shift;
23    my @nums;
25    while ($game =~ /$mark/g) {
26       push @nums, substr(MAGIC, $-[0], 1);
27    }
29    return @nums;
30 }
32 sub get_victor {
33    my $game = shift;
34    my $marks = shift;
35    my $victor;
37    TEST: for (@$marks) {
38       my $mark = $_;
39       my @nums = get_moves $game, $mark;
41       next unless @nums >= 3;
42       for (combinations(\@nums, 3)) {
43          my @comb = @$_;
44          if (sum(@comb) == 15) {
45             $victor = $mark;
46             last TEST;
47          }
48       }
49    }
51    return $victor;
52 }
54 sub hal_move {
55    my $game = shift;
56    my @nums = $game =~ /[1-9]/g;
57    my $rand = int rand @nums;
59    return $nums[$rand];
60 }
62 sub complain {
63    print "Daisy, Daisy, give me your answer do.\n";
64 }
66 sub import {
67    no strict;
68    no warnings;
70    my $p = __PACKAGE__;
71    my $c = caller;
73    *{ $c . '::get_victor' } = \&{ $p . '::get_victor' };
74    *{ $c . '::hal_move' } = \&{ $p . '::hal_move' };
75    *{ $c . '::complain' } = \&{ $p . '::complain' };
76 }
78 1;

The first thing that you will probably notice when you try to run the program with the chip1 module in place is an error message like the following:

$ Can't locate Algorithm/ in @INC (you may need to install the Algorithm::Combinatorics module) (@INC contains: hal /usr/local/lib64/perl5/5.30 /usr/local/share/perl5/5.30 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at hal/ line 17.
BEGIN failed--compilation aborted at hal/ line 17.
Compilation failed in require at /usr/share/perl5/ line 15.
BEGIN failed--compilation aborted at game line 18.

When you see an error like the one above, just use the dnf command to search Fedora’s package repository for the name of the system package that provides the needed PERL module as shown below. Note that the module name and path from the above error message have been prefixed with */ and then surrounded with single quotes.

$ dnf provides '*/Algorithm/'
perl-Algorithm-Combinatorics-0.27-17.fc31.x86_64 : Efficient generation of combinatorial sequences
Repo        : fedora
Matched from:
Filename    : /usr/lib64/perl5/vendor_perl/Algorithm/

Hopefully it will find the needed package which you can then install:

$ sudo dnf install perl-Algorithm-Combinatorics

Once you have all the needed modules installed, the program should work.

How it works

This example is admittedly quite contrived. Nothing about Tic-Tac-Toe is complex enough to need a CPAN module. To demonstrate installing and using a non-standard module, the above program uses the combinations library routine from the Algorithm::Combinatorics module to generate a list of the possible combinations of three numbers from the provided set. Because the board numbers have been mapped to a 3×3 magic square, any set of three numbers that sum to 15 will be aligned on a column, row or diagonal and will therefore be a winning combination.
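The magic-square claim is easy to verify mechanically. This short check (written in Python as a translation of the idea, not the article’s Perl) confirms that the eight rows, columns, and diagonals of chip1’s MAGIC square are exactly the 3-number combinations of 1 through 9 that sum to 15:

```python
from itertools import combinations

# The 3x3 magic square used by the chip1 module, row by row.
MAGIC = [[2, 9, 4],
         [7, 5, 3],
         [6, 1, 8]]

rows = [set(r) for r in MAGIC]
cols = [set(c) for c in zip(*MAGIC)]
diags = [{MAGIC[i][i] for i in range(3)}, {MAGIC[i][2 - i] for i in range(3)}]
lines = rows + cols + diags

# All 3-number subsets of 1..9 that sum to 15 ...
sums15 = [set(c) for c in combinations(range(1, 10), 3) if sum(c) == 15]

# ... are exactly the 8 winning lines: summing to 15 is equivalent
# to being aligned on a row, column, or diagonal.
assert len(sums15) == 8
assert all(line in sums15 for line in lines)
assert all(s in lines for s in sums15)
```

This equivalence is why get_victor only needs sum(@comb) == 15 rather than a table of winning positions.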

Modules are imported into a program with the use and require commands. The only difference between them is that the use command automatically calls the import subroutine (if one exists) in the module being imported. The require command does not automatically call any subroutines.

Modules are just files with a .pm extension that contain PERL subroutines and variables. They begin with the package command and end with 1;. But otherwise, they look like any other PERL script. The file name should match the package name. Package and file names are case sensitive.

Beware that when you are reading online documentation about PERL modules, the documentation often veers off into topics about classes. Classes are built on modules, but a simple module does not have to adhere to all the restrictions that apply to classes. When you start seeing words like method, inheritance and polymorphism, you are reading about classes, not modules.

There are two subroutine names that are reserved for special use in modules. They are import and unimport and they are called by the use and no directives respectively.

The purpose of the import and unimport subroutines is typically to alias and unalias the module’s subroutines in and out of the calling namespace respectively. For example, line 17 of the chip1 module shows the sum subroutine being imported from the List::Util module.

The constant module, as used on line 07 of the chip1 module, is also altering the caller’s namespace (chip1), but rather than importing a predefined subroutine, it is creating a special type of variable.

All the identifiers immediately following the use keywords in the above examples are modules. On my system, many of them can be found under the /usr/share/perl5 directory.

Notice that the above error message states “@INC contains:” followed by a list of directories. @INC is a special PERL variable that lists, in order, the directories from which modules should be loaded. The first file found with a matching name will be used.

As demonstrated on line 19 of the Tic-Tac-Toe game, the lib module can be used to update the list of directories in the @INC variable.

The chip1 module above provides an example of a very simple import subroutine. In most cases you will want to use the import subroutine that is provided by the Exporter module rather than implementing your own. A custom import subroutine is used in the above example to demonstrate the basics of what it does. Also, the custom implementation makes it easy to override the subroutine definitions in later examples.

The import subroutine shown above reveals some of the hidden magic that makes packages work. All variables that are both globally scoped (that is, created outside of any pair of curly brackets) and dynamically scoped (that is, not prefixed with the keywords my or state) and all global subroutines are automatically prefixed with a package name. The default package name if no package command has been issued is main.

By default, the current package is assumed when an unqualified variable or subroutine is used. When get_move is called from the PROMPT block in the above example, main::get_move is assumed because the PROMPT block exists in the main package. Likewise, when get_moves is called from the get_victor subroutine, chip1::get_moves is assumed because get_victor exists in the chip1 package.

If you want to access a variable or subroutine that exists in a different package, you either have to use its fully qualified name or create a local alias that refers to the desired subroutine.

The import subroutine shown above demonstrates how to create subroutine aliases that refer to subroutines in other packages. On lines 73-75, the fully qualified names for the subroutines are being constructed and then the symbol table name for the subroutine in the calling namespace (the package in which the use statement is being executed) is being assigned the reference of the subroutine in the local package (the package in which the import subroutine is defined).
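As a rough cross-language analogy (Python here, with hypothetical stand-in names, not the article’s actual code), the aliasing on lines 73-75 amounts to binding a name in one namespace to a function object owned by another namespace:

```python
import types

# Two throwaway namespaces standing in for the chip1 and main packages.
chip1 = types.ModuleType("chip1")
main = types.ModuleType("main")

def get_victor(game):
    # Stand-in body; the real logic lives in the Perl chip1 module.
    return None

chip1.get_victor = get_victor

# The aliasing step: main.get_victor now refers to the very same code
# object, much like assigning \&{chip1::get_victor} into the caller's
# symbol table in the Perl import subroutine.
main.get_victor = chip1.get_victor

assert main.get_victor is chip1.get_victor
```

The key point carries over: no code is copied; both names resolve to one underlying subroutine.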

Notice that subroutines, like variables, have sigils. The sigil for subroutines is the ampersand symbol (&). In most contexts, the sigil for subroutines is optional. When working with references (as shown on lines 73-75 of the import subroutine) and when checking if a subroutine is defined (as shown on lines 57 and 71 of the PROMPT block), the sigil for subroutines is required.

The import subroutine shown above is just a bare minimum example. There is a lot that it doesn’t do. In particular, a proper import subroutine would not automatically import any subroutines or variables. Normally, the user would be expected to provide a list of the routines to be imported on the use line and that list is available to the import subroutine in the @_ array.

Final notes

Lines 25-27 of the chip1 module provide a good example of PERL’s dense notation problem. With just a couple of lines of code, the board numbers on which a given mark has been placed can be determined. But does the statement within the conditional clause of the while loop perform the search from the beginning of the game variable on each iteration? Or does it continue from where it left off each time? PERL correctly guesses that I want it to provide the position ($-[0]) of the next mark, if any exists, on each iteration. But exactly what it will do can be very difficult to determine just by looking at the code.

The last things of note in the above examples are the strict and warnings directives. They enable extra compile-time and runtime debugging messages respectively. Many PERL programmers recommend always including these directives so that programming errors are more likely to be spotted. The downside of having them enabled is that some complex code will sometimes cause the debugger to erroneously generate unwanted output. Consequently, the strict and/or warnings directives may need to be disabled in some code blocks to get your program to run correctly as demonstrated on lines 67 and 68 of the example chip1 module. The strict and warnings directives have nothing to do with the program and they can be omitted. Their only purpose is to provide feedback to the program developer.

Sunday, 01 March


Supporting Spark as a First-Class Citizen in Yelp’s Computing Platform [Yelp Engineering and Product Blog]

Yelp extensively utilizes distributed batch processing for a diverse set of problems and workflows. Some examples include: Computation over Yelp’s review corpus to identify restaurants that have great views Training ML models to predict personalized business collections for individual users Analytics to extract the most in-demand service offerings for Request a Quote projects On-demand workloads to investigate surges in bot traffic so we can quickly react to keep Yelp safe Over the past two years, Yelp engineering has undertaken a series of projects to consolidate our batch processing technologies and standardize on Apache Spark. These projects aimed to simultaneously accelerate...

Friday, 28 February


Fedora’s gaggle of desktops [Fedora Magazine]

There are 38 different desktops or window managers in Fedora 31. You could try a different one every day for a month, and still have some left over. Some have very few features. Some have so many features they are called a desktop environment. This article can’t go into detail on each, but it’s interesting to see the whole list in one place.

Criteria for desktops

To be on this list, the desktop must show up in the desktop manager’s selection list. If a desktop has more than one entry in that list, it is still counted as a single desktop. An example is “GNOME”, “GNOME Classic” and “GNOME (Wayland)”: all three show up in the desktop manager list, but they are just GNOME.

List of desktops


9wm

Emulation of the Plan 9 window manager 8 1/2. dnf install 9wm

awesome

Highly configurable, framework window manager for X. Fast, light and extensible. dnf install awesome

Blackbox

Very small and fast window manager. Fedora uses the maintained fork on GitHub. dnf install blackbox

bspwm

A tiling window manager based on binary space partitioning. dnf install bspwm

byobu

Light-weight, configurable window manager built upon GNU screen. dnf install byobu

Cinnamon

Cinnamon provides a desktop with a traditional layout and advanced features; it is easy to use, powerful and flexible. dnf group install "Cinnamon Desktop"

cwm

Calm Window Manager by the OpenBSD project. dnf install cwm

Deepin

Deepin desktop is the desktop environment released with deepin (the Linux distribution). It aims at being elegant and easy to use. dnf group install "Deepin Desktop" (optional) dnf group install "Deepin Desktop Office" "Media packages for Deepin Desktop"

dwm

Dynamic window manager for X. dnf install dwm (optional) dnf install dwm-user

Enlightenment

Enlightenment window manager. dnf install enlightenment

E16

The Enlightenment window manager, DR16. dnf install e16 (optional) dnf install e16-epplets e16-keyedit e16-themes

Fluxbox

Window manager based on Blackbox. dnf install fluxbox (optional) dnf install fluxbox-pulseaudio fluxbox-vim-syntax

FVWM

Highly configurable multiple virtual desktop window manager. dnf install fvwm

GNOME

GNOME is a highly intuitive and user friendly desktop environment. * both X11 and Wayland. dnf group install "GNOME" (optional but large) dnf group install "Fedora Workstation"

herbstluftwm

A manual tiling window manager. dnf install herbstluftwm (optional) dnf install herbstluftwm-zsh herbstluftwm-fish

i3

Improved tiling window manager. dnf install i3 (optional) dnf install i3-doc i3-ipc

IceWM

Window manager designed for speed, usability, and consistency. dnf install icewm (optional) dnf install icewm-minimal-session

JWM

Joe's Window Manager. dnf install jwm

KDE Plasma Desktop

The KDE Plasma Workspaces, a highly-configurable graphical user interface which includes a panel, desktop, system icons and desktop widgets, and many powerful KDE applications. * both X11 and Wayland. dnf group install "KDE Plasma Workspaces" (optional) dnf group install "KDE Applications" "KDE Educational applications" "KDE Multimedia support" "KDE Office" "KDE Telepathy" (optional for wayland) dnf install kwin-wayland plasma-workspace-wayland

Lumina

A lightweight, portable desktop environment. dnf install lumina-desktop (optional) dnf install lumina-*

LXDE

LXDE is a lightweight X11 desktop environment designed for computers with low hardware specifications like netbooks, mobile devices or older computers. dnf group install "LXDE Desktop" (optional) dnf group install "LXDE Office" "Multimedia support for LXDE"

LXQt

LXQt is a lightweight X11 desktop environment designed for computers with low hardware specifications like netbooks, mobile devices or older computers. dnf group install "LXQt Desktop" (optional) dnf group install "LXQt Office" "Multimedia support for LXQt"

MATE

MATE Desktop is based on GNOME 2 and provides a powerful graphical user interface for users who seek a simple, easy-to-use traditional desktop interface. dnf group install "MATE Desktop" (optional) dnf group install "MATE Applications"

Musca

A simple dynamic window manager for X. dnf install musca

Openbox

A highly configurable and standards-compliant X11 window manager. dnf install openbox (optional) dnf install openbox-kde openbox-theme-mistral-thin-dark

Pantheon

The Pantheon desktop environment is the DE that powers elementaryOS. dnf group install "Pantheon Desktop" (optional) dnf install elementary-capnet-assist elementary-greeter elementary-shortcut-overlay

pekwm

A small and flexible window manager. dnf install pekwm

Qtile

A pure-Python tiling window manager. dnf install qtile

Ratpoison

Minimalistic window manager. dnf install ratpoison

Sawfish

An extensible window manager for the X Window System. dnf install sawfish (optional) dnf install sawfish-pager

spectrwm

Minimalist tiling window manager written in C. dnf install spectrwm

Sugar

A software playground for learning about learning. * Possibly the most unique desktop of this list. dnf group install "Sugar Desktop Environment" (optional) dnf group install "Additional Sugar Activities"

Sway

i3-compatible window manager for Wayland. * Wayland only. dnf install sway

twm

X.Org X11 twm window manager. dnf install xorg-x11-twm

Window Maker

A fast, feature rich window manager. dnf install WindowMaker (optional) dnf install WindowMaker-extra

wmx

A really simple window manager for X. dnf install wmx

Xfce

A lightweight desktop environment that works well on low end machines. dnf group install "Xfce Desktop" (optional) dnf group install "Applications for the Xfce Desktop" "Extra plugins for the Xfce panel" "Multimedia support for Xfce" "Xfce Office"

xmonad

A tiling window manager. dnf install xmonad (optional) dnf install xmonad-mate

Photo by Annie Spratt on Unsplash.

Thursday, 27 February

Wednesday, 26 February


The Causal Analysis of Cannibalization in Online Products [Code as Craft]


Nowadays an internet company typically has a wide range of online products to fulfill customer needs. It is common for users to interact with multiple online products on the same platform at the same time. Consider, for example, Etsy’s marketplace. There are organic search, recommendation modules (recommendations), and promoted listings enabling users to find interesting items. Although each of them offers a unique opportunity for users to interact with a portion of the overall inventory, they are functionally similar and compete for users’ limited time, attention, and monetary budgets.

To optimize users’ overall experience, instead of understanding and improving these products separately, it is important to gain insight into cannibalization: an improvement in one product induces users to decrease their engagement with other products.  Cannibalization is very difficult to detect in offline evaluation, yet it frequently shows up in online A/B tests.

Consider the following example: an A/B test for a recommendation module.  Such a test typically involves a change to the underlying machine learning algorithm, to the user interface, or to both.  In this test, the recommendation change significantly increased users’ clicks on recommendations while significantly decreasing their clicks on organic search results.

Table 1: A/B Test Results for Recommendation Module
(Simulated Experiment Data to Imitate the Real A/B Test)

% Change = Effect / Mean of Control

Recommendation Clicks    +28%***
Search Clicks             -1%***

Note: ‘***’ p<0.001, ‘**’ p<0.01, ‘*’ p<0.05, ‘.’ p<0.1.  The two-tailed p-value is derived from the z-test for H0: the effect is zero, which is based on asymptotic normality.
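As a back-of-the-envelope illustration of the test behind Table 1, the percent change and two-tailed z-based p-value can be computed from per-user click counts.  The data below are simulated by us purely for illustration; they are not Etsy's data or code:

```python
import math
import random

random.seed(42)

# Hypothetical per-user recommendation click counts for the two buckets.
control = [random.gauss(10.0, 3.0) for _ in range(50_000)]
treatment = [random.gauss(12.8, 3.0) for _ in range(50_000)]  # ~+28% lift

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def z_test(t, c):
    """Two-tailed z-test for H0: the effect (mean difference) is zero."""
    effect = mean(t) - mean(c)
    se = math.sqrt(var(t) / len(t) + var(c) / len(c))
    z = effect / se
    # Normal CDF via erf; p = 2 * (1 - Phi(|z|)).
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    pct_change = effect / mean(c)  # "% Change = Effect / Mean of Control"
    return pct_change, p

pct, p = z_test(treatment, control)
print(f"recommendation clicks: {pct:+.1%} (p = {p:.2g})")
```

With bucket sizes in the tens of thousands, even a 1% relative change can be highly significant, which is why both rows of Table 1 carry three stars.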

There is an intuitive explanation for the drop in search clicks: users might not need to search as much as usual because they could find what they were looking for through recommendations.  In other words, improved recommendations effectively diverted users’ attention away from search and thus cannibalized user engagement in search.

Note that the increased recommendation clicks did not translate into observed gains in key performance indicators: conversion and Gross Merchandise Sales (GMS).  Conversion and GMS are typically measured at the sitewide level, because the ultimate goal of improving any product on our platform is a better overall user experience.  The launch decision for a new algorithm is usually based on a significant conversion/GMS gain in A/B tests, so the insignificant conversion/GMS gain alongside the significant lift in recommendation clicks puts product owners in a difficult position.  They wonder whether the cannibalization of search clicks could, in turn, cannibalize the conversion/GMS gain from recommendations.  In other words, it is plausible that the improved recommendations would have brought a more significant increase in conversion/GMS than the A/B test shows, with the positive impact partially offset by the negative impact of the cannibalized user engagement in search.  If the conversion/GMS gain is indeed being cannibalized, then, instead of abandoning the new recommendation algorithm, it is advisable to launch it and revise the search algorithm to work better with it; otherwise, the development of recommendation algorithms would suffer.

The challenge is to separate the revenue loss (through search) from the original revenue gain (from the recommendation module change).  Unfortunately, in A/B tests we can only observe the cannibalization of user engagement (the induced reduction in search clicks), not the cannibalization of revenue.

Flaws of Purchase-Funnel Based Attribution Metrics

Specific product revenue is commonly attributed based on purchase-funnel/user-journey.  For example, the purchase-funnel of recommendations could be defined as a sequence of user actions: “click A in recommendations → purchase A”.  To compute recommendation-attributed conversion rate, we have to segment all the converted users into two groups: those who follow the pattern of the purchase-funnel and those who do not.  Only the first segment is used for counting the number of conversions.

However, the validity of the attribution is questionable.  In many A/B tests of new recommendation algorithms, it is common to see a recommendation-attributed revenue change of over +200% alongside a search-attributed revenue change of around -1%.  It is difficult to see how a +200% conversion lift could be cannibalized down to the observed +0.2%.  These peculiar numbers remind us that attribution metrics based on purchase-funnels are hard to interpret and unreliable, for at least two reasons.

First, users usually take more complicated journeys than a heuristically-defined purchase-funnel can capture.  Here are two examples:

  1. If the recommendations make users stay longer on Etsy, and users click listings on other pages and modules to make purchases, then the recommendation-attributed metrics fail to capture the contribution of the recommendations to these conversions.  The purchase-funnel is based on “click”, and there is no way to incorporate “dwell time” into the purchase-funnel.
  2. Suppose the true user journey is “click A in recommendation → search A → click A in search results → click A in many other places → purchase A”.  Shall the conversion be attributed to recommendation or search? Shall all the visited pages and modules share the credit of this conversion? Any answer would be too heuristic to be convincing.

Second, attribution metrics cannot measure any causal effects.  The random assignment of users in an A/B test makes the treatment and control buckets comparable, which is what allows us to calculate the average treatment effect (ATE).  The segments of users who follow the purchase-funnel pattern, however, may not be comparable between the two buckets, because the segmentation criterion (i.e., the user journey) is determined after random assignment, so the segments are not randomized across buckets.  In causal-inference terms, segmentation introduces factors that cause users to follow the purchase-funnel pattern, and these factors confound the relationship between treatment and outcome.  Any post-treatment segmentation can break the ignorability assumption of causal identification and invalidate the causal conclusions of an experiment analysis (see, e.g., Montgomery et al., 2018).
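To see why post-treatment segmentation is dangerous, consider a toy simulation (entirely hypothetical, not from the paper): the treatment below has zero true effect on the outcome, yet restricting the comparison to funnel-followers manufactures a spurious negative "effect":

```python
import random

random.seed(1)

n = 300_000
all_t, all_c, seg_t, seg_c = [], [], [], []
for _ in range(n):
    t = random.random() < 0.5                     # randomized treatment
    intent = random.gauss(0, 1)                   # hidden shopping intent
    # Following the funnel depends on BOTH treatment and intent.
    funnel = intent + (1.0 if t else 0.0) + random.gauss(0, 1) > 1.0
    y = intent + random.gauss(0, 1)               # treatment has NO true effect on y
    (all_t if t else all_c).append(y)
    if funnel:                                    # post-treatment segmentation
        (seg_t if t else seg_c).append(y)

def mean(xs):
    return sum(xs) / len(xs)

ate = mean(all_t) - mean(all_c)   # close to the true effect: zero
seg = mean(seg_t) - mean(seg_c)   # clearly negative: segmentation bias
print(round(ate, 3), round(seg, 3))
```

Treated users need less intent to end up in the funnel segment, so the segment's treated users are, on average, lower-intent shoppers than its control users, and the comparison is no longer apples to apples.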

Causal Mediation Analysis

We exploit search clicks as a mediator in the causal path between recommendation improvement and conversion/GMS, and extend a formal framework of causal inference, causal mediation analysis (CMA), to separate the cannibalized effect from the original effect of the recommendation module change.  CMA splits the observed conversion/GMS gains (average treatment effect, ATE) in A/B tests into the gains from the recommendation improvement (direct effect) and the losses due to cannibalized search clicks (indirect effect). In other words, the framework allows us to measure the impacts of recommendation improvement on conversion/GMS directly as well as indirectly through a mediator such as search (Figure 1).  The significant drop in search clicks makes it a good candidate for the mediator. In practice, we can try different candidate mediators and use the analysis to confirm which one is the mediator.

Figure 1: Directed Acyclic Graph (DAG) to illustrate the causal mediation in recommendation A/B test.

However, it is challenging to implement CMA as it appears in the literature.  An internet platform typically has many online products, and all of them could be mediators on the causal path between the tested product and the final business outcomes.  Figure 2 shows multiple mediators (M0, M1, and M2) on the causal path between treatment T and the final business outcome Y.  In practice, it is very difficult to measure user engagement in all of these mediators.  Multiple unmeasured, causally-dependent mediators in A/B tests break the sequential ignorability assumption of CMA and invalidate it (see Imai et al. (2010) for the assumptions behind CMA).

Figure 2: DAG of Multiple Mediators
Note: M0 and M2 are upstream and downstream mediators of the mediator M1 respectively.

We define the generalized average causal mediation effect (GACME) and the generalized average direct effect (GADE) to analyze cannibalization in this setting.  GADE captures the average causal effect of the treatment T that goes through all the channels that do not involve M1; GACME captures the average causal effect of T that goes through all the channels that do.  We proved that, under some assumptions, GADE and GACME are identifiable even when there are numerous unmeasured, causally-dependent mediators.  If there is no unmeasured mediator, GADE and GACME collapse to the ordinary ADE and ACME; if there is, ADE and ACME cannot be identified while GADE and GACME can.

Table 2 shows the sample results.  The recommendation improvement directly produced a 0.5% conversion lift, but the cannibalized search clicks caused a 0.3% conversion loss, so the observed total conversion gain was not significant.  When the outcome is GMS, we can likewise see the loss through cannibalized search clicks.  The results confirm the cannibalization of the conversion lift and serve as evidence supporting the launch of the new recommendation module.

Table 2: Causal Mediation Analysis on Simulated Experiment Data

% Change = Effect/Mean of Control
Cannibalization in Gain                  Causal Mediation                 Conversion   GMS
The Original Gain from Recommendation    GADE(0) (Direct Component)       +0.5%*       +0.2%
The Loss Through Search                  GACME(1) (Indirect Component)    -0.3%***     -0.4%***
The Observed Gain                        ATE (Total Effect)               +0.2%        -0.3%

Note: ‘***’ p<0.001, ‘**’ p<0.01, ‘*’ p<0.05, ‘.’ p<0.1.  The two-tailed p-value is derived from the z-test for H0: the effect is zero, which is based on asymptotic normality.

The implementation follows a causal-mediation-based methodology we recently developed and published at KDD 2019. We also made a fun video describing the intuition behind the methodology.  It is easy to implement, requiring only the simultaneous solution of two linear regression equations (Section 4.4).  We simply need the treatment assignment indicator, search clicks, and the observed revenue for each experimental unit.  Interested readers can refer to our paper for more details and to our GitHub repo for the analysis code.
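Here is a minimal sketch of a two-regression mediation decomposition on simulated data.  The data-generating process, coefficients, and variable names below are our own illustrative assumptions, not taken from the paper or Etsy's repo; see the paper's Section 4.4 for the actual estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical simulation: treatment T improves recommendations,
# reduces search clicks M, and both T and M affect the outcome Y.
T = rng.integers(0, 2, n).astype(float)   # random bucket assignment
M = 5.0 - 1.0 * T + rng.normal(0, 1, n)   # mediator: search clicks
Y = 1.0 + 0.10 * T + 0.05 * M + rng.normal(0, 0.5, n)

# Regression 1 (mediator model): M ~ 1 + T
a = np.linalg.lstsq(np.column_stack([np.ones(n), T]), M, rcond=None)[0]
# Regression 2 (outcome model): Y ~ 1 + T + M
b = np.linalg.lstsq(np.column_stack([np.ones(n), T, M]), Y, rcond=None)[0]

direct = b[1]            # effect of T not routed through search clicks
indirect = a[1] * b[2]   # effect of T routed through search clicks
total = direct + indirect
print(direct, indirect, total)   # roughly +0.10, -0.05, +0.05
```

The decomposition mirrors Table 2: the direct component is the gain from the recommendation change itself, the indirect component is the loss through cannibalized search clicks, and their sum is the total effect observed in the A/B test.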

We have successfully deployed our model to identify products that are prone to cannibalization.  In particular, it has helped product and engineering teams understand the tradeoffs between search and recommendations, and focus on the right opportunities.  The direct effect on revenue is a more informative key performance indicator than the observed average treatment effect to measure the true contribution of a product change to the marketplace and to guide the decision on the launch of new product features.