Sunday, 21 July

17:58

6.6 Million Lose CBS Channels After 'Business Dispute' With AT&T [Slashdot]

"Media giants are embroiled in yet another fight over TV rates, and viewers are once again paying the price," writes Engadget. CBS' channels in 17 markets (including New York, San Francisco and Atlanta) have gone dark on AT&T services like DirecTV Now and U-verse after the two companies failed to reach an agreement on a new carriage contract before the old one expired at 2AM ET on July 19th. As is often the case in disputes like this, the two sides are each accusing each other of being unreasonable -- though AT&T in particular has also claimed that CBS is using All Access as a weapon. CNET notes that the dispute also affects 100 CBS stations and affiliates on Direct Now, citing reports that it ultimately impacts a total of 6.6 million TV viewers in the U.S. "A business dispute took CBS off the air for millions of satellite television customers of DirecTV and AT&T U-verse on Saturday," according to a news report (from CBS): CBS said that while it didn't want its customers caught in the middle, it is determined to fight for fair value... AT&T countered in a statement provided to Variety that CBS is "a repeat blackout offender" that has pulled its programming from other carriers before in order to get its way. "Isn't this the sort of thing they enemies of net neutrality assured us would never happen?" writes long-time Slashdot reader shanen. "Or is it just a plot to sell VPN services?"

Read more of this story at Slashdot.

16:54

When Online Teachers See Child Abuse [Slashdot]

Rick Zeman (Slashdot reader #15,628) shares "a thought-provoking article on when online English teachers see child abuse at the other end of their cameras." Of the 24 online teachers interviewed, about two thirds told "harrowing" stories, EdSurge reports, and within the teachers' Facebook groups new reports "surface nearly every week." The teachers post in these private Facebook groups because they aren't sure how to process, much less report, what they saw. They ask one another the same few questions in many different ways: Has this ever happened to you? Is what I'm feeling normal? How should I respond? Will the company do something about it? One company employs 70,000 online teachers who reach more than 600,000 children in China -- yet one of its teachers complains that the company offered her no guidance for these situations. After saying they "take these matters very seriously" (with a procedure in place for these "very rare" instances), that company declined repeated requests for further interviews "and would not elaborate on its procedure for referring reports of abuse to local agencies." (Even though in China, as in the U.S., the described behavior is illegal.) One China-born anthropologist says that many parents may not even be aware of a 2015 law which banned domestic abuse against children. Last month another company advised its teachers that those who do report incidents will not receive any follow-up from the company, for reasons of "student confidentiality" -- though "We assure you that our teams will address any concerns in a prudent manner."

Read more of this story at Slashdot.

16:16

Linux 5.3-rc1 Debuts As "A Pretty Big Release" [Phoronix]

Just as expected, Linus Torvalds this afternoon issued the first release candidate of the forthcoming Linux 5.3 kernel...

15:34

Google Settles Age Discrimination Lawsuit [Slashdot]

Long-time Slashdot reader sfcat quotes Forbes: Almost a decade ago, courts sounded a clear warning bell that Google's culture was tainted by illegal and pervasive age discrimination. Inexplicably, Google didn't listen. And so the Los Angeles Times recently reported that Google has agreed to pay $11 million to settle a federal lawsuit alleging Google engaged in a systemic practice of discriminating on the basis of age in hiring. Some 227 plaintiffs will collect an average of $35,000 each. Google actually agreed to settle the case in December but the final settlement agreement was presented to a federal judge on Friday. The lawsuit was filed by Cheryl Fillekes, a software engineer who was interviewed by Google four times from 2007 to 2014, starting when she was 47, but was never hired. The lawsuit alleged Google hired younger workers based on "cultural fit." Under the settlement, Google also agreed to train its managers about age bias and to create an "age diversity in recruiting" committee. Forbes points out that the median age for all Google employees in 2017 was 30, "a decade younger than the median age of U.S. workers." "On its web page, Google says its mission is to 'organize the world's information and make it universally accessible and useful.' But for some reason Google has failed as a company to organize and use the information that age discrimination is illegal."

Read more of this story at Slashdot.

14:34

Comic-Con Trailers Include 'Star Trek: Picard' and HBO's 'Watchmen' Series [Slashdot]

"At Comic-Con, Sir Patrick Stewart took to the Hall H stage Saturday afternoon to discuss his new series, Star Trek: Picard," reports CBS News: The series will focus on what caused famed captain and admiral Jean-Luc Picard to leave Starfleet, and his life since.... Patrick Stewart -- who is also an executive producer -- answered questions about the show. "We never know, do we, when our best moment will be. And that is now," Stewart said. "I knew something unusual would happen. I knew I needed to be a part of it." Stewart has been heavily involved in crafting "Star Trek: Picard" and frequently visits the writer's room... Brent Spiner, who played the character Data on TNG, said there was "no way" he could say no to the opportunity to work with Stewart again.... The show is set 20 years after the events of "Star Trek: The Next Generation" around the year 2399. This sets the series further into the future than any previous Star Trek series. But fans should not expect to see the same Jean-Luc Picard they know from "The Next Generation" series. During the press tour, Kurtzman teased that the show will be very different and "grounded." The series will explore how Picard has changed in that time, making him reckon with the choices he has made. Kurtzman hinted that there are circumstances that have "radically" shifted that have caused the beloved Starfleet admiral to question his life decisions. The two-minute trailer includes a surprising cameo, and Variety reports that CBS has also committed to two seasons of Star Trek: Lower Decks, an animated series focused on "the support crew serving on one of Starfleet's least important ships." (They also report that Seth MacFarlane announced season 3 of The Orville will be moving from Fox to Hulu.) Also at Comic-Con, HBO shared the first full trailer for their upcoming Watchmen TV series, a sequel to the original Alan Moore graphic novel. Rolling Stone quotes HBO as saying that Watchmen "takes place in an alternative, contemporary reality in the United States, in which masked vigilantes became outlawed due to their violent methods." Marvel also revealed that their next Thor movie (Thor: Love and Thunder) will incude both Chris Hemsworth and Natalie Portman as Lady Thor, and shared footage from their upcoming Black Widow movie. And CNET has a comprehensive rundown (with trailers) of all the DC Comics superhero shows on the CW network, including Arrow, Supergirl, The Flash, Black Lightning, and Batwoman.

Read more of this story at Slashdot.

14:03

It's a facial-recognition bonanza: Oakland bans it, activists track it, and pics taken from dating-site OkCupid feed it [The Register]

Watching us, watching you

Roundup  Hello, welcome to this week’s roundup of news in the ever-encroaching world of AI and machine learning. We’ll be talking about everyone’s favorite topic at the moment: facial recognition.…

14:00

Vulkan 1.1.116 Published With Subgroup Size Control Extension [Phoronix]

Vulkan 1.1.116 was released today as the latest weekly update to this high performance graphics API and comes with one new extension in tow...

13:34

Atlassian Changes Annual Performance Reviews To Stop Rewarding 'Brilliant Jerks' [Slashdot]

Australia-based Atlassian "has implemented a new performance review strategy designed to give their workers a better evaluation of how they're performing," reports Business Insider, adding that Atlassian's global head of talent said the company wants to measure contributions to a larger team effort. "We want people to get rewarded for what they delivered." In 2018 it soft-launched a strategy where most of its performance review process would have nothing to do with the skills in an employee's job, but more to do with how well they are living the company values. Now, the strategy is being rolled out permanently and will be tied to employee bonuses... "We want to be able to evaluate a whole person and encourage them to bring their full self to work and not just focus on skills itself, but really focus on the way they do their work," said Bek Chee, Atlassian's global head of talent. She added that while workforces have changed over the past 30 years, performance reviews, for the most part, have stayed the same... With this performance review system, Atlassian aims to throw out the idea of the "brilliant jerk", which Chee describes as someone who is technically-talented, but perhaps at the expense of others. Instead it is focusing on how an employee demonstrates the company values, how they complete their roles and how they contribute to their team. "We really want to enforce the way that values get lived, the way that people impact the team and the way that they also contribute within their role."

Read more of this story at Slashdot.

12:34

Can We Use Special Sails To Bring Old Satellites Back Down To Earth? [Slashdot]

There are already nearly 5,000 satellites orbiting Earth, "and many of them are non-functioning space debris now, clogging up orbital paths for newer satellites," reports Universe Today. Yet over the next five years we expect to launch up to 2,600 more -- which is prompting a search for solutions to "the growing problem of space debris in Low-Earth Orbit." Some exotic-sounding solutions involve harpoons, nets, magnets, even lasers. Now NASA has given Purdue University-related startup Vestigo Aerospace money for a six-month study that looks at using drag sails to de-orbit space junk, including satellites, spent rocket boosters, and other debris, safely... Drag sails are a bit different than other methods. While the harpoons, lasers, and nets proposed by various agencies are meant to deal with the space junk that's already accumulated, drag sails are designed to be built into a satellite and deployed at the end of their useful life... Once deployed, they would reduce an object's velocity and then help it deorbit safely. Currently, satellites deorbit more or less on their own terms, and it's difficult to calculate where they may strike Earth, if they're too large to burn up on re-entry... [D]rag sails offer an affordable, and potentially easy-to-develop method to ensure future satellites don't outlive their usefulness. The company was started by a Purdue associate professor of engineering who tells the site they're building in scalability, so their sails can handle satellites that weigh one kilogram -- or one ton.

Read more of this story at Slashdot.

11:34

'Fortnite' Creator Epic Games Supports Blender Foundation With $1.2 Million [Slashdot]

Long-time Slashdot reader dnix writes: Apparently having a lot of people playing Fortnite is good for the open source community too. Epic Games' MegaGrants program just awarded the Blender Foundation $1.2 million over the next three years...to further the success of the free and open source 3D creation suite. It's part of the company's $100 million "MegaGrants" program, according to the announcement. "Open tools, libraries and platforms are critical to the future of the digital content ecosystem," said Tim Sweeney, founder and CEO of Epic Games. "Blender is an enduring resource within the artistic community, and we aim to ensure its advancement to the benefit of all creators."

Read more of this story at Slashdot.

10:34

GitLab Survey Finds Positive Results For Both DevOps and Working Remotely [Slashdot]

GitLab's CEO and co-founder says there was one big takeaway from their recent "2019 Global Developer Report: DevSecOps": that early adopters of a strong DevOps model experience greater security. "Security teams in a longstanding DevOps environment reported they are three times more likely to discover bugs before code is merged," according to the GitLab blog, "and 90% more likely to test between 91% and 100% of code than teams in early-stage DevOps." But after polling over 4,000 software professionals, the survey also found positive results from another workplace arrangement, which they report under the headline "Remote work works." According to our survey respondents, working remotely leads to greater collaboration, better documentation, and transparency. In fact, developers in a mostly remote environment are 23% more likely to have good insight into what colleagues are working on and rate the maturity of their organization's security practices 29% higher than those who work in a traditional office environment.

Read more of this story at Slashdot.

09:34

Microsoft Demos Hologram 'Holoportation' [Slashdot]

Microsoft "continues to plug away at making holoportation possible," reports ZDNet: In a new demonstration, officials showed off a scenario where a life-sized holographic representation of a person could be beamed into a scenario with real-time simultaneous language translation happening -- a communication scenario on which Microsoft has been working for years. At Microsoft's Inspire partner show (which is co-located with its Ready sales kick-off event) on July 17, Microsoft demonstrated such a scenario on stage during CEO Satya Nadella's keynote. Azure Corporate Vice President Julia White donned a HoloLens 2 headset and [demonstrated] a full-size hologram of herself translated simultaneously into Japanese and maintaining her speech cadence and patterns. [Microsoft later said that the life-sized hologram was created at Microsoft's Mixed Reality Capture Studios.] Microsoft pulled off the demo by combining a number of its existing technologies, White said, including Azure speech-to-text, Azure Speech Translation and neural text-to-speech. The text-to-speech from Azure Speech Services allows apps, tools and devices to convert text into natural human-like synthesized speech. Users can create their own custom voice unique to them. In a video of the demo, White first appears to be holding a smaller version of her hologram in the palm of her own hand. She jokingly telling the audience, "Let me introduce you to Mini-Me."

Read more of this story at Slashdot.

09:18

Saturday Morning Breakfast Cereal - Welsh [Saturday Morning Breakfast Cereal]

Hovertext:
Found this delightful story in 'The Monsters and the Critics: And Other Essays.'

09:04

Earth Just Had Its Hottest June On Record [Slashdot]

Layzej shared this article from the Washington Post: Boosted by a historic heat wave in Europe and unusually warm conditions across the Arctic and Eurasia, the average temperature of the planet soared to its highest level ever recorded in June. According to data released Monday by NASA, the global average temperature was 1.7 degrees Fahrenheit (0.93 Celsius) above the June norm (based on a 1951-to-1980 baseline), easily breaking the previous June record, set in 2016, of 1.5 degrees Fahrenheit (0.82 Celsius) above the average. The month was punctuated by a severe heat wave that struck Western Europe in particular during the last week, with numerous all-time-hottest-temperature records falling in countries with centuries-old data sets. Notably, 13 locations in France surpassed their highest temperature ever recorded. The heat wave's highest temperature of 114.6 degrees Fahrenheit (45.9 Celsius), posted in Gallargues-le-Montueux, was 3.2 degrees above the old record, set during an infamous heat wave in July and August 2003.
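
(A quick note on the unit conversions, since the story mixes two kinds of figures: the record numbers are temperature differences, which convert without the 32-degree offset -- 0.93 x 9/5 is about 1.67, rounding to the 1.7 degrees Fahrenheit quoted above -- while the absolute reading converts with it: 45.9 x 9/5 + 32 = 114.6 degrees Fahrenheit.)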

Read more of this story at Slashdot.

08:49

The New Features & Improvements Of The Linux 5.3 Kernel [Phoronix]

The Linux 5.3 kernel merge window is expected to close today so here is our usual recap of all the changes that made it into the mainline tree over the past two weeks. There are a lot of changes to be excited about from Radeon RX 5700 Navi support to various CPU improvements and ongoing performance work to supporting newer Apple MacBook laptops and Intel Speed Select Technology enablement.

08:34

Unprecedented Heat Wave Near North Pole [Slashdot]

Long-time Slashdot reader Freshly Exhumed quotes the CBC: Weather watchers are focused on the world's most northerly community, which is in the middle of a record-breaking heat wave. "It's really quite spectacular," said David Phillips, Environment Canada's chief climatologist. "This is unprecedented." The weather agency confirmed that Canadian Forces Station Alert hit a record of 21 C [69.8 F] on Sunday. On Monday, the military listening post on the top of Ellesmere Island had reached 20 C [68 F] by noon and inched slightly higher later in the day. A government report in April found that Canada was warming at twice the rate of the rest of the world, and this new article points out that recently records have been beaten "not by fractions, but by large margins." For example, the Alert station's average temperature had been a cool 44.6 F, and Environment Canada's chief climatologist says a deviation of this magnitude is like the city of Toronto reaching a high of 107.6 F. "It's nothing that you would have ever seen."

Read more of this story at Slashdot.

07:34

Celo Launches Decentralized Open Source Financial Services Prototype [Slashdot]

Forbes notes that other financial transaction platforms hope to benefit from Facebook's struggles in launching its Libra cryptocurrency -- including Celo. The key value proposition of the assets running on top of the [Celo] platform is that they are immune to the wide swings in volatility that have plagued leading crypto assets in recent years. Many are designed to mirror the price movements of traditional currency, and most have names that reflect their fiat brethren, such as the Gemini Dollar. This is a critical need for the industry, as no asset will be able to serve as a currency if it does not maintain a consistent price. However, rather than being a centralized issuer that supports the price pegs with fiat held in banks, Celo has built a full-stack platform (meaning it developed the underlying blockchain and applications that run on top), that can offer an unlimited number of stablecoins all backed by cryptoassets held in reserve. Furthermore, Celo is what is known as an algorithmic-based stablecoin provider. This distinction means that rather than being a centralized entity that controls issuances and redemptions, the company employs a smart-contract based stability protocol that automatically expands or contracts the supply of its collateral reserves in a fashion similar to how the Federal Reserve adjusts the U.S. monetary supply... Additionally, a key differentiator for Celo from similar projects is that for the first time its blockchain platform allows users to send/receive money to a person's phone number, IP address, email, as well as other identifiers. This feature will be critical to the long-term success for the network because it eliminates the need for counterparties in a transaction to share their public keys with each other prior to a transaction. And now... Celo is open-sourcing its entire codebase and design after two years of development. Additionally, the company is launching the first prototype of its platform, named the Alfajores Testnet, and Celo Wallet, an Android app that will allow users to manage their accounts and send/receive payments on the testnet. This announcement and product are intended to be just the first of what will be a wide range of financial services applications designed to connect the world. Celo's investors include LinkedIn founder Reid Hoffman and Twitter/Square CEO Jack Dorsey, the article points out, as well as some of Libra's first members, "including venerated venture capital firm Andreessen Horowitz and crypto-unicorn Coinbase."

Read more of this story at Slashdot.

05:34

Google Abandons Go's try() Function Proposal, Citing 'Overwhelming' Community Response [Slashdot]

Google's Go programming language will not add a try() function in its next major version, "despite this being a major part of what was proposed," reports the Register: Error handling in Go is currently based on using if statements to compare a returned error value to nil. If it is nil, no error occurred. This requires developers to write a lot of if statements. "In general Go programs have too much code-checking errors and not enough code handling them," wrote Google principal engineer Russ Cox in an overview of the error-handling problem in Go. There was therefore a proposal to add a built-in try function which lets you eliminate many of the if statements and triggers a return from a function if an error is detected. The proposal was not for full exception handling, which is already present in Go via the panic and recover functions. That proposal has now been abandoned. Robert Griesemer, one of the original designers of Go, announced the decision in a post Tuesday... "Based on the overwhelming community response and extensive discussion here, we are marking this proposal declined ahead of schedule. As far as technical feedback, this discussion has helpfully identified some important considerations we missed, most notably the implications for adding debugging prints and analyzing code coverage. "More importantly, we have heard clearly the many people who argued that this proposal was not targeting a worthwhile problem. We still believe that error handling in Go is not perfect and can be meaningfully improved, but it is clear that we as a community need to talk more about what specific aspects of error handling are problems that we should address."
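
To make the contrast concrete, here is a minimal sketch of the current nil-checking idiom next to what the declined try() form would have looked like (parsePair and its inputs are hypothetical examples, not code from the proposal):

    package main

    import (
        "fmt"
        "strconv"
    )

    // parsePair shows today's idiom: each fallible call returns an error
    // value that the caller must compare against nil before continuing.
    func parsePair(a, b string) (int, int, error) {
        x, err := strconv.Atoi(a)
        if err != nil {
            return 0, 0, fmt.Errorf("parsing %q: %v", a, err)
        }
        y, err := strconv.Atoi(b)
        if err != nil {
            return 0, 0, fmt.Errorf("parsing %q: %v", b, err)
        }
        return x, y, nil
    }

    // Under the declined proposal, the two checks above could have been
    // collapsed to roughly:
    //     x := try(strconv.Atoi(a))
    //     y := try(strconv.Atoi(b))
    // with try() triggering an early return from parsePair on any non-nil error.

    func main() {
        x, y, err := parsePair("7", "oops")
        if err != nil {
            fmt.Println("error:", err)
            return
        }
        fmt.Println(x + y)
    }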

Read more of this story at Slashdot.

04:29

Feral's GameMode 1.4 Adds Flatpak Support, Better I/O Optimization Handling [Phoronix]

Feral developers released a new version of their GameMode Linux game performance optimization daemon/client this weekend in order to allow this update to land in the upcoming Fedora Workstation 31. GameMode 1.4 offers up many features, including new interfaces that allow better GNOME integration, which explains Fedora's interest in seeing this version in their autumn Linux distribution update...

04:24

KDE Plasma 5.17 Making It Simple To Display A Network's QR Code For Easy Sharing [Phoronix]

With the KDE Plasma 5.17 release, the desktop will make it easy to display a network's QR code, making it quick and simple to share network information with other users and devices...

02:46

Linux 5.3 Will Surprisingly Support The Newest Keyboard/Trackpads Of Apple MacBooks [Phoronix]

In a last-minute surprise for the Linux 5.3 kernel merge window, support has landed for the keyboards and trackpads on newer Apple MacBook and MacBook Pro laptops...

01:58

Palo Alto gateway security alert, FSB hack, scourge of data-stealing web plugins, and more [The Register]

A summary of computer security news for you, delivered rapid-fire-style

Roundup  Let's catch up with all the recent infosec news beyond what we've already covered.…

01:34

The Inventor Who Fought To Get Black Box Flight Recorders Into Every Plane [Slashdot]

This week the BBC told the remarkable story of the man who invented the "black box" flight recorders -- and of all the resistance he encountered along the way. dryriver shared this summary: In 1934, a passenger plane named Miss Hobart crashed into the sea off the coast of Australia. Among those killed was Anglican missionary Rev Hubert Warren, whose last gift to his 8-year-old son David had been a crystal radio set. Young David Warren spent hours a day tinkering with the radio, eventually learning enough electronics engineering to build his own radios and sell them to other people. David Warren later grew up to be a rocket scientist working for Australia's Aeronautical Research Laboratories. In 1953, the department loaned him to an expert panel trying to solve a costly and distressing mystery: why did the British de Havilland Comet, the world's first commercial jet airliner and the great hope of the new Jet Age, keep crashing? David Warren was confronted with a daunting problem -- how to determine from heavily deformed crashed plane fragments what had happened to the plane while it was in the air... Warren had an interesting idea -- what if every plane in the sky had a mini recorder in the cockpit...? Warren's superior did not approve of the idea and told him to stick to chemicals and fuels. When Warren got a new boss, the new boss was more sympathetic, but told him to do the R&D for it in complete secrecy. Since it wasn't a government-approved venture or a war-winning weapon, it couldn't be seen to take up lab time or money. "If I find you talking to anyone, including me, about this matter, I will have to sack you." When Warren first floated the idea of a cockpit recorder publicly, the pilots' union responded with fury, branding the recorder a snooping device, and insisted "no plane would take off in Australia with Big Brother listening." Undeterred, Warren took to his garage and invented the first "Black Box" flight recorder.

Read more of this story at Slashdot.

Saturday, 20 July

23:13

IO_uring Gets A Huge Performance Fix - Up To 755x Improvement [Phoronix]

IO_uring is designed to deliver fast and efficient I/O operations thanks to a re-designed interface introduced in Linux 5.1 with various efficiency improvements compared to the kernel's existing asynchronous I/O code. But it turns out there was a big bottleneck within the current IO_uring code up until now...

23:03

NASA Marks The 50-Year Anniversary of Man's First Steps on the Moon [Slashdot]

It's exactly one half century from that moment in time when men first walked on the moon, writes NASA administrator Jim Bridenstine. "Today, on the golden anniversary of the Apollo 11 moon landing, NASA looks back with heartfelt gratitude for the Apollo generation's trailblazing courage as we -- the Artemis generation -- prepare to take humanity's next giant leap to Mars." The lethargic lull of scientific fatalism afflicted portions of America then as it sometimes does today. There is nothing inevitable about scientific discovery nor is there a predetermined path of cutting-edge innovation. Long hours of arduous study and experimentation are required merely to glimpse a flicker of enlightenment that can lead to greater heights of human achievement... The Apollo program hastened ground-breaking technological advancements that continue to bestow benefits to modern civilization today. Flame resistant textiles, water purification systems, cordless tools, more effective dialysis machines and improvements to food preservation and medicine are just some of the innovative wonders generated during that era. Furthermore, NASA's utilization of integrated circuits on silicon chips aboard the lunar module's computer unit helped jumpstart the budding computer industry into the massive enterprise it is today. Perhaps the most enduring legacy of the Apollo missions was their ability to inspire young Americans across the country to join science, technology, engineering and math related fields of study... After more than 50 years, the benefits of human space exploration to humanity are clear. By proud example, the Apollo program taught us we cannot venture aimlessly into the uncharted territory of future discovery merely hoping to happen upon greater advancement. Technological progress is a deliberate choice made by investing in missions that will expand our limits of understanding and capability... NASA is preparing to use the lunar surface as a proving ground to perfect our scientific and technological knowledge and utilize international partnerships, as well as the growing commercial space industry. This time when we go back to the moon we are going to stay...

Read more of this story at Slashdot.

22:05

Lima Gallium3D Gets A Reworked Scheduler [Phoronix]

Landing this week in Mesa 19.2 for the Lima Gallium3D driver for Arm Mali 400/450 series hardware is a reworked GPIR register scheduler...

19:34

Is There Tension Between Developers and Security Professionals? [Slashdot]

"Everyone knows security needs to be baked into the development lifecycle, but that doesn't mean it is," writes ZDNet, reporting on a new survey they say showed that "long-standing friction between security and development teams remain." The results came from GitLab's "2019 Global Developer Report: DevSecOps" survey of over 4,000 software professionals. Nearly half of security pros surveyed, 49%, said they struggle to get developers to make remediation of vulnerabilities a priority. Worse still, 68% of security professionals feel fewer than half of developers can spot security vulnerabilities later in the life cycle. Roughly half of security professionals said they most often found bugs after code is merged in a test environment. At the same time, nearly 70% of developers said that while they are expected to write secure code, they get little guidance or help. One disgruntled programmer said, "It's a mess, no standardization, most of my work has never had a security scan." Another problem is it seems many companies don't take security seriously enough. Nearly 44% of those surveyed reported that they're not judged on their security vulnerabilities. ZDNet also cites Linus Torvalds' remarks on the Linux kernel mailing list in 2017, complaining about how security people celebrate when code is hardened against an invalid access. "[F]rom a developer standpoint, things really are not done. Not even close. From a developer standpoint, the bad access was just a symptom, and it needs to be reported, and debugged, and fixed, so that the bug actually gets corrected. So from a developer standpoint, the end point of hardening is just the starting point, and when you think you're done, we're really only getting started." Torvalds then pointed out that the user community also has a third set of entirely different expectations, adding that "the number one rule of kernel development is that 'we don't break users'. Because without users, your program is pointless, and all the development work you've done over decades is pointless... and security is pointless too, in the end." Juggling the interest of users and developers, Torvalds suggests security people should adopt "do no harm" as their mantra, and "when adding hardening features, the first step should *ALWAYS* be 'just report it'. Not killing things, not even stopping the access. Report it. Nothing else."

Read more of this story at Slashdot.

17:34

Are Millennials Spending Too Much Money On Coffee? [Slashdot]

An anonymous reader quotes the Atlantic: Suze Orman wants young people to stop "peeing" away millions of dollars on coffee. Last month, the personal-finance celebrity ignited a controversy on social media when a video she starred in for CNBC targeted a familiar villain: kids these days and their silly $5 lattes. Because brewing coffee at home is less expensive, Orman argued, purchasing it elsewhere is tantamount to flushing money away, which makes it a worthy symbol of Millennials' squandered resources... In the face of coffee shaming, young people usually point to things like student loans and housing prices as the true source of the generation's instability, not their $100-a-month cold-brew habits... Orman and her compatriots now receive widespread pushback when denigrating coffee aficionados, a change that reflects the shifting intergenerational tensions that are frequently a feature of the post-Great Recession personal-finance genre. The industry posits that many of the sweeping generational trends affecting Americans' personal stability -- student-loan debt, housing insecurity, the precarity of the gig economy -- are actually the fault of modernity's encouragement of undisciplined individual largesse. In reality, those phenomena are largely the province of Baby Boomers, whose policies set future generations on a much tougher road than their own. With every passing year, it becomes harder to sell the idea that the problems are simply with each American as a person, instead of with the system they live in. "There's a reason for this blame-the-victim talk" in personal-finance advice, the journalist Helaine Olen wrote recently. "It lets society off the hook. Instead of getting angry at the economics of our second gilded age, many end up furious with themselves." That misdirection is useful for people in power, including self-help gurus who want to sell books... [W]hen it comes to money, says Laura Vanderkam, the author of All the Money in the World: What the Happiest People Know About Getting and Spending, there are usually only a couple of things that actually make a difference in how stable people are. It's the big stuff: how much you make, how much you pay for housing, whether or not you pay for a car.

Read more of this story at Slashdot.

17:04

DXVK 1.3.1 Brings Logging Improvements, GPU Load Monitoring In The HUD [Phoronix]

Just one week after releasing DXVK 1.3, lead developer Philip Rebohle has released DXVK 1.3.1 with a few more features plus a number of bug fixes -- including performance work...

16:44

Microsoft Warns of Political Cyberattacks, Announces Free Vote-Verification Software [Slashdot]

"Microsoft on Wednesday announced that it would give away software designed to improve the security of American voting machines," reports NBC News. Microsoft also said its AccountGuard service has already spotted 781 cyberattacks by foreign adversaries targeting political organizations -- 95% of which were located in the U.S. The company said it was rolling out the free, open-source software product called ElectionGuard, which it said uses encryption to "enable a new era of secure, verifiable voting." The company is working with election machine vendors and local governments to deploy the system in a pilot program for the 2020 election. The system uses an encrypted tracking code to allow a voter to verify that his or her vote has been recorded and has not been tampered with, Microsoft said in a blog post... Edward Perez, an election security expert with the independent Open Source Election Technology Institute, said Microsoft's move signals that voting systems, long a technology backwater, are finally receiving attention from the county's leading technical minds. "We think that it's good when a technology provider as significant as Microsoft is stepping into something as nationally important as election security," Perez told NBC News. "ElectionGuard does provide verification and it can help to detect attacks. It's important to note that detection is different from prevention." Microsoft also said its notified nearly 10,000 customers that they've been targeted or compromised by nation-state cyberattacks, according to the article -- mostly from Russia, Iran, and North Korea. "While many of these attacks are unrelated to the democratic process," Microsoft said in a blog post, "this data demonstrates the significant extent to which nation-states continue to rely on cyberattacks as a tool to gain intelligence, influence geopolitics, or achieve other objectives."

Read more of this story at Slashdot.

15:44

'Super Mario Maker 2' Finally Acknowledges Nintendo Fan Communities [Slashdot]

It was the best-selling game of June, with IGN calling it "the most accessible game design tool ever created, and that core is just one part of a greater whole..." Since its launch three weeks ago, fans have already built over 2 million custom stages, NPR notes -- but the real news is that Super Mario Maker 2 finally represents a shift in Nintendo's attitude towards its fan community: It's Nintendo's reliance on the creative spirit of these dedicated players that makes the Super Mario Maker series such a quietly radical property within the Nintendo canon... By loosening its grip on a beloved property and tossing the keys to the player community, Nintendo feeds into the fan-obsessive tendencies they've previously refused. With the Super Mario Maker series, Nintendo acknowledges the history of competitive speedrunning, tournament play, and even the masochistic fan games that have made their games visible and interesting in an entirely different way. It's the rare Nintendo game that is depending on those players, creators, and spectators to keep it alive. Super Mario Maker 2 has only been out for a few weeks, but already we've seen how the game's deceptively complex course editor has led to the community making some astounding levels... Nintendo has always been old-school in the way they rely on offline experiences, downplaying the kind of online communities that other developers prioritize. Ironically, it is that indifference that has made fan communities formed around Nintendo games feel singular and special -- they're smaller, more intimate, and regulated by the players themselves. With the Super Mario Maker franchise, Nintendo finally acknowledges the power and influence of its most obsessive fans -- by creating something that couldn't thrive without them. IGN argues that "it's astonishing how incredibly well it's all held together in one cohesive package... It does nearly everything better than its already excellent predecessor, introducing some incredible new ideas, level styles, building items, and so much more - all while maintaining the charm of Mario games we know and love." And Slashdot reader omfglearntoplay writes "If you like old games from the 1980s, this is your game."

Read more of this story at Slashdot.

14:47

GNOME Shell + Mutter 3.33.4 Released [Phoronix]

Florian Müllner released new development versions of GNOME Shell and Mutter today for this week's GNOME 3.33.4 development milestone...

14:34

If This Type of Dark Matter Existed, People Would Be Dying of Unexplained Wounds [Slashdot]

sciencehabit shared this article from Science magazine: Dark matter, the mysterious substance that makes up most of the mass of the universe, has proved notoriously hard to detect. But scientists have now proposed a surprising new sensor: human flesh. The idea boils down to this: If a certain type of dark matter particle existed, it would occasionally kill people, passing through them like a bullet. Because no one has died from unexplained gunshot-like wounds, this type of dark matter does not exist, according to a new study... [Its title? "Death by Dark Matter."] This experiment doesn't rule out heavy macro dark matter altogether, says Robert Scherrer, a co-author and theoretical physicist at Vanderbilt University. It merely eliminates a certain range of them. Heavier macro dark matter would not occur frequently enough to measure, notes Katherine Freese, a theoretical physicist at the University of Michigan, and other forms wouldn't kill people. "There is probably still room for very heavy dark matter," says Paolo Gorla, a particle physicist at Italy's underground Gran Sasso National Laboratory, who is not involved with the study.

Read more of this story at Slashdot.

13:34

Is Russia Trying to Deanonymize Tor Traffic? [Slashdot]

A contractor for Russia's intelligence agency suffered a breach, revealing projects they were pursuing -- including one to deanonymize Tor traffic. An anonymous reader shared this report from ZDNet: The breach took place last weekend, on July 13, when a group of hackers going by the name of 0v1ru$ hacked into SyTech's Active Directory server from where they gained access to the company's entire IT network, including a JIRA instance. Hackers stole 7.5TB of data from the contractor's network, and they defaced the company's website with a "yoba face," an emoji popular with Russian users that stands for "trolling..." Per the different reports in Russian media, the files indicate that SyTech had worked since 2009 on a multitude of projects. In February ZDNet reported that Russia disconnected itself from the rest of the internet in a test -- and suggests today that it was a real-world test of one of these leaked "secret projects" from the Russian intelligence agency. But the other projects include:

Nautilus-S - a project for deanonymizing Tor traffic with the help of rogue Tor servers.
Nautilus - a project for collecting data about social media users (such as Facebook, MySpace, and LinkedIn).
Reward - a project to covertly penetrate P2P networks, like the one used for torrents.
Mentor - a project to monitor and search email communications on the servers of Russian companies.
Tax-3 - a project for the creation of a closed intranet to store the information of highly-sensitive state figures, judges, and local administration officials, separate from the rest of the state's IT networks.

ZDNet also reports that the Tor-deanonymizing project, started in 2012, "appears to have been tested in the real world," citing a 2014 paper which found 18 malicious Tor exit nodes located in Russia. Each of those hostile Russian exit nodes used version 0.2.2.37 of Tor -- the same one described in these leaked files.

Read more of this story at Slashdot.

12:34

New 'HBO Max' Streaming Service Will Include a 'Dune' TV Series [Slashdot]

An anonymous reader quotes Android Authority: Studios like Disney and NBCUniversal are making preparations to launch their own streaming services, and they are planning to take back their back catalog of films and TV series with them. That's also what's happening with WarnerMedia, the AT&T-owned entertainment group that operates, among many other things, HBO, Warner Bros, and CNN. Recently, the conglomerate announced its own upcoming dedicated streaming service, HBO Max... Unconfirmed reports from Hollywood trade news outlets claim that HBO Max will cost between $16 and $17 a month. The service will be ad-free, although some reports have indicated that WarnerMedia might launch an ad-supported version of HBO Max at some point after the official launch in 2020. If that happens, it's likely the cost to sign up will be much less... While HBO Max will have quite a lot for subscribers to watch from WarnerMedia's library of content, it will have its own range of original TV shows and movies that will be found exclusively on the streaming service. They will be known as Max Originals. Here's what has been announced for HBO Max so far, which includes a couple of spin-offs from current and upcoming Warner Bros. series:

Dune: The Sisterhood - Based on the classic Dune sci-fi novels by Frank Herbert, this 10-part series will focus on the Bene Gesserit group of women in this universe. Denis Villeneuve, who is directing the upcoming feature film adaptation of Dune, will also direct the pilot episode of the series.
Gremlins -- The Animated Series - The mischievous and destructive creatures from the two Gremlins feature films will return as an animated series on HBO Max...

A beta version of the service may launch before the end of 2019, according to Deadline. The studio's announcement also promised that HBO Max would include previously-announced HBO programs:

Stephen King's The Outsider - a dark mystery starring Ben Mendelsohn, produced and directed by Jason Bateman.
Lovecraft Country - a unique horror series based on a novel by Matt Ruff, written and executive produced by Misha Green, and executive produced by Jordan Peele (Us) and J.J. Abrams (Westworld).
The Nevers - Joss Whedon's new science fiction series starring Laura Donnelly.

Read more of this story at Slashdot.

11:34

New Map Shows Where America's Police, Businesses Are Using Facial Recognition and Other Surveillance Tech [Slashdot]

"Fight For the Future, a tech-focused nonprofit, on Thursday released its Ban Facial Recognition map, logging the states and cities using surveillance technology," reports CNET -- noting that "surveillance technology" in this case includes Amazon's Ring doorbell security cameras. A CNET investigation earlier this year highlighted the close ties between Ring and police departments across the US, many of which offer free or discounted Ring doorbells using taxpayer money. The cameras have helped police create an easily accessible surveillance network in neighborhoods and allowed law enforcement to request videos through an app. The arrangement has critics worried about the erosion of privacy. Until the release of Fight for the Future's map, there was no comprehensive directory of all the police departments that had partnered with Ring. Now you can find them by going on the map and toggling it to "Police (Local)." It lists more than 40 cities where police have partnered with Amazon for Ring doorbells.... The map is far from complete. Police departments aren't always up front about the technology that they're using. On the interactive map, Fight for the Future asked visitors to send it any new entries to add to the map.... The map also has filters for airports, stores and stadiums that are using facial recognition, as well as states that provide driver's license photos to the FBI's database of faces... . Fight for the Future's map also features a filter for regions where facial recognition use by government is banned. For now, that's only in San Francisco; Somerville, Massachusetts; and Oakland, California. The group's deputy director told CNET that the map's goal is allowing people "to turn their ambient anxiety into effective action by pushing at the local and state level to ban this dangerous tech. "No amount of regulation will fix the threat posed by facial recognition," he added. "It must be banned."

Read more of this story at Slashdot.

10:34

'Caloric Restriction' Study Finds Surprising Health Benefits [Slashdot]

The New York Times reports positive results from the first major clinical study of caloric restriction (funded by America's National Institutes of Health) in which 143 healthy volunteers ate (on average) 300 calories less each day: They lost weight and body fat. Their cholesterol levels improved, their blood pressure fell slightly, and they had better blood sugar control and less inflammation. At the same time, a control group of 75 healthy people who did not practice caloric restriction saw no improvements in any of these markers. Some of the benefits in the calorie restricted group stemmed from the fact that they lost a large amount of weight, on average about 16 pounds over the two years of the study. But the extent to which their metabolic health got better was greater than would have been expected from weight loss alone, suggesting that caloric restriction might have some unique biological effects on disease pathways in the body, said William Kraus, the lead author of the study and a professor of medicine and cardiology at Duke University. "We weren't surprised that there were changes," he said. "But the magnitude was rather astounding. In a disease population, there aren't five drugs in combination that would cause this aggregate of an improvement...." The researchers looked at measures of quality of life and discovered that the calorie-restricted group reported better sleep, increased energy and improved mood.... One question the study could not answer was whether caloric restriction could extend life span in humans the way that it can in other animals... But ultimately, caloric restriction did have a beneficial impact on a wide range of risk factors for diabetes and heart disease, two conditions that cause death and disability for millions of Americans, especially as they get older. Asked about the study, the chairman of the nutrition department at Harvard's School of Public Health questioned whether caloric restriction would be practical for most people, given that "we are living in an obesogenic environment with an abundance of energy-dense, nutrient-poor foods that are cheap, accessible and heavily marketed."

Read more of this story at Slashdot.

09:34

Python 3.8 Will Finally Include the Walrus Operator [Slashdot]

An anonymous reader quotes LWN: Python 3.8 is feature complete at this point, which makes it a good time to see what will be part of it when the final release is made. That is currently scheduled for October, so users don't have that long to wait to start using those new features. The headline feature for Python 3.8 is also its most contentious. The process for deciding on Python Enhancement Proposal (PEP) 572 ("Assignment Expressions") was a rather bumpy ride that eventually resulted in a new governance model for the language. That model meant that a new steering council would replace longtime benevolent dictator for life (BDFL) Guido van Rossum for decision-making, after Van Rossum stepped down in part due to the "PEP 572 mess". Out of that came a new operator, however, that is often called the "walrus operator" due to its visual appearance. Using ":=" in an if or while statement allows assigning a value to a variable while testing it... It is a feature that many other languages have, but Python has, of course, gone without it for nearly 30 years at this point. In the end, it is actually a fairly small change for all of the uproar it caused.
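
For the curious, here is a minimal sketch of the operator in action (requires Python 3.8 or later; the sample data and read loop are illustrative examples, not code from the PEP):

    import io

    # Assign and test in a single expression inside an if statement:
    data = [1, 4, 9, 16]
    if (n := len(data)) > 3:
        print(f"list is long ({n} elements)")

    # The classic motivating case: a read loop that previously needed
    # either a priming read before the loop or a `while True` with a break.
    stream = io.BytesIO(b"abcdefghij")
    while chunk := stream.read(4):
        print(chunk)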

Read more of this story at Slashdot.

09:29

Systemd Introduces A New & Practical Service For Dealing With PStore [Phoronix]

Adding to the list of new features for systemd 243 is another last-minute addition to this growing init system... Systemd picked up a new service and, while some may view it as bloat, it should be quite practical at least for those encountering kernel crashes from time to time...

08:34

Facebook Backpedals From Its Original Ambitious Vision for Libra [Slashdot]

An anonymous reader quotes Ars Technica: David Marcus, the head of Facebook's new Calibra payments division, appeared before two hostile congressional committees this week with a simple message: Facebook knows policymakers are concerned about Libra, and Facebook won't move forward with the project until their concerns are addressed. While he didn't say so explicitly, Marcus' comments at hearings on Tuesday and Wednesday represented a dramatic shift in Facebook's conception of Libra. In Facebook's original vision, Libra would be an open and largely decentralized network, akin to Bitcoin. The core network would be beyond the reach of regulators. Regulatory compliance would be the responsibility of exchanges, wallets, and other services that are the "on ramps and off ramps" to the Libra ecosystem. Facebook now seems to recognize its original vision was a non-starter with regulators. So this week Marcus sketched out a new vision for Libra -- one in which the Libra Association will shoulder significant responsibility for ensuring compliance with laws relating to money laundering, terrorist financing, and other financial crimes... [T]here's a pretty fundamental tradeoff between network openness and effective enforcement of regulations governing payment networks. If the Libra Association doesn't have a way to enforce compliance by wallet providers, criminals are likely to flock to wallet services that don't strictly enforce the rules -- or to download open source wallet software and use non-custodial accounts. But if the Libra Association does have a mechanism for forcing compliance, that inherently raises the bar for entering the market and makes the Libra network look more like conventional financial networks -- with all the red tape that entails. This could be particularly harmful for marginalized people in developing countries, since developers in those markets will have the fewest resources to jump through regulatory hoops.

Read more of this story at Slashdot.

07:00

Employers Are Mining the Data Their Workers Generate To Figure Out What They're Up To, and With Whom [Slashdot]

An anonymous reader quotes a report from The Wall Street Journal: To be an employee of a large company in the U.S. now often means becoming a workforce data generator -- from the first email sent from bed in the morning to the Wi-Fi hotspot used during lunch to the new business contact added before going home. Employers are parsing those interactions to learn who is influential, which teams are most productive and who is a flight risk. Companies, which have wide legal latitude in the U.S. to monitor workers, don't always tell them what they are tracking. [...] It's not just emails that are being tallied and analyzed. Companies are increasingly sifting through texts, Slack chats and, in some cases, recorded and transcribed phone calls on mobile devices. Microsoft Corp. tallies data on the frequency of chats, emails and meetings between its staff and clients using its own Office 365 services to measure employee productivity, management efficacy and work-life balance. Tracking the email, chats and calendar appointments can paint a picture of how employees spend an average of 20 hours of their work time each week, says Natalie McCollough, a general manager at Microsoft who focuses on workplace analytics. The company only allows managers to look at groups of five or more workers. Advocates of using surveillance technology in the workplace say the insights allow companies to better allocate resources, spot problem employees earlier and suss out high performers. Critics warn that the proliferating tools may not be nuanced enough to result in fair, equitable judgments. The report says that "U.S. employers are legally entitled to access any communications or intellectual property created in the workplace or on devices they pay for that employees use for work." Companies are getting smarter by analyzing phone calls and conference room conversations. "In some cases, tonal analysis can help diagnose culture issues on a team, showing who dominates conversations, who demurs and who resists efforts to engage in emotional discussions," the report says.

Read more of this story at Slashdot.

05:19

RadeonSI Gallium3D Driver Adds Navi Wave32 Support [Phoronix]

One of the new features of the RDNA architecture with Navi is support for single cycle issue Wave32 execution on SIMD32. Until now the RadeonSI code used just Wave64, but now there is support in this AMD open-source Linux OpenGL driver for Wave32...

04:30

Saturday Morning Breakfast Cereal - Humility [Saturday Morning Breakfast Cereal]

Hovertext:
I'm just realizing most of my Life Tips involve emotionally damaging other people.

04:18

The Arm SoC/Platform Changes Finally Sent In For Linux 5.3: Jetson Nano, New SoCs [Phoronix]

The Arm SoC/platform changes arrived a bit late in the Linux 5.3 merge window, which ends this weekend. They were only sent in on Friday night but include Librem 5 Developer Kit support in terms of the DeviceTree bits, as well as improved NVIDIA Jetson Nano support and various other SoC/platform additions...

04:00

Largest Hybrid Electric Plane Set To Take Flight [Slashdot]

Ampaire, a Los Angeles clean tech company in my neck of the woods, is set to begin accepting orders for a hybrid electric aircraft at the EAA AirVenture airshow in Wisconsin next week. Dubbed the EEL, the aircraft is in fact a retrofit of a Cessna 337, an aircraft that has a forward-mounted prop engine that pulls and a rear-mounted prop engine that pushes. Ampaire's retrofit will replace one of those internal combustion engines with an electric motor powered by batteries. ZDNet reports: Ampaire believes hybrid power may be a stopgap, providing fuel savings while still retaining many of the benefits of an internal combustion drivetrain. "The Ampaire Electric EEL is the first step in bringing lower emissions, lower-operating costs, and quieter operations to general aviation through electrification," according to the company's CEO Kevin Noertker. "The original Cessna 337 provided great utility, and this hybrid electric conversion retains those advantages while reducing fuel cost and maintenance by about 50 percent." The EEL is now undergoing a 30-month test program, which began in June. One of the tests will be demonstrating reliable single-engine climbs on each powerplant. Ampaire expects the aircraft to be certified by 2021. Ampaire's EEL aircraft will seat four or six passengers. The company says the aircraft cost will be competitive with comparable piston twins.

Read more of this story at Slashdot.

03:31

Brussels changes its mind AGAIN on .EU domains: Euro citizens in post-Brexit Britain can keep them after all [The Register]

Your periodic reminder that there was maybe a reason so many people voted to leave

The European Commission has, yet again, changed its position on who can have a .eu domain after Brexit.…

02:49

NFS Changes On Linux 5.3 Will Allow Clients To Use New "nconnect" Mount Option [Phoronix]

Sent out on Thursday were the NFS client updates for the Linux 5.3 kernel merge window. This time around are a few interesting changes...

01:41

British ISPs throw in the towel, give up sending out toothless copyright infringement warnings [The Register]

What a waste of time and money that was

Creative Content UK, the organization that terrified British internet users by requiring ISPs to send out emails with accusations of copyright infringement, has decided to drop this questionable practice.…

00:25

Weston 7.0 Reaches Alpha With PipeWire, HDCP, EGL Partial Updates & More [Phoronix]

Wayland release manager Simon Ser announced the alpha release of the Weston 7.0 reference compositor on Friday that also marks the feature freeze for this Wayland compositor update...

00:00

Quantum Leap From Australian Research Promises Super-Fast Computing Power [Slashdot]

An anonymous reader quotes a report from The Guardian: Michelle Simmons, a former Australian of the Year, and her team at the University of New South Wales announced in a paper published in the journal Nature on Thursday that they have achieved the first two-qubit gate between atom qubits in silicon, allowing the qubits to communicate with each other 200 times faster than previously achieved, completing an operation in 0.8 nanoseconds. A two-qubit gate operates like a logic gate in traditional computing, and the team at UNSW achieved the faster operation by putting the two atom qubits closer together than ever before -- just 13 nanometers -- and controllably observing and measuring their spin states in real time. A scanning tunneling microscope was used to place the atoms in silicon after the optimal distance between the two qubits had been worked out. The research has been two decades in the making, after researchers in Australia opted to build a quantum computer on silicon material.

Read more of this story at Slashdot.

Friday, 19 July

22:06

Intel / Clear Linux Is Looking For Your Feedback On Your Linux Development Workflow [Phoronix]

Intel's Clear Linux crew has launched a twelve-question survey seeking feedback on your Linux usage, though the survey caters slightly towards developers. While the survey is being put out by Intel's performance-oriented Linux distribution, users of any Linux platform are encouraged to participate...

21:30

Professor Patrick Winston, Former Director of MIT's Artificial Intelligence Laboratory, Dies At 76 [Slashdot]

Patrick Winston, a beloved professor and computer scientist at MIT, died on July 19 at Massachusetts General Hospital in Boston. He was 76. MIT News reports: A professor at MIT for almost 50 years, Winston was director of MIT's Artificial Intelligence Laboratory from 1972 to 1997 before it merged with the Laboratory for Computer Science to become MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL). A devoted teacher and cherished colleague, Winston led CSAIL's Genesis Group, which focused on developing AI systems that have human-like intelligence, including the ability to tell, perceive, and comprehend stories. He believed that such work could help illuminate aspects of human intelligence that scientists don't yet understand. He was renowned for his accessible and informative lectures, and gave a hugely popular talk every year during the Independent Activities Period called "How to Speak." Winston's dedication to teaching earned him many accolades over the years, including the Baker Award, the Eta Kappa Nu Teaching Award, and the Graduate Student Council Teaching Award.

Read more of this story at Slashdot.

20:03

James Bond Was Going To Fight Robot Sharks With Nukes In New York's Sewers [Slashdot]

dryriver writes: The line "sharks with fricking lasers" was once popular on Slashdot. It sounds like a joke, but a never-made James Bond movie co-written back in the day by Sean Connery was actually going to feature robotic sharks carrying stolen NATO nukes in order to attack New York. Bond was going to stop the sharks inside the New York sewer system, waterski out of the sewers, paraglide up to the Statue of Liberty's head, then fight a Bond villain inside said head, with the villain's "blood trickling out of the Statue of Liberty's eye like tears" at the end of the fight. All this was going to happen without the consent of Cubby Broccoli, the official producer of the Bond movies. Why did the movie never get made? The producers of competing Bond movies were fighting in court over who has what rights to the franchise and characters. In the end, "Bond fights robot sharks with nukes" was scrapped, and "Never Say Never Again," a remake of "Thunderball," was made instead. This featured stolen nukes as well, but unfortunately no robot sharks or other "Austin Powers" style silliness.

Read more of this story at Slashdot.

17:58

In the cooler for the next three years: Hacker of iCloud accounts used by athletes and rappers [The Register]

Phishing led to shopping spree with victims' credit cards

A man from the US state of Georgia who pleaded guilty in March to breaking into the Apple iCloud accounts of sports and entertainment figures was sentenced on Thursday to three years and one month in federal prison – and ordered to pay almost $700,000 in restitution.…

16:20

Enjoying that 25Mbps internet speed, America? Oh, it's just 6Mbps? And you're unhappy? Can't imagine why [The Register]

Internet testers went down to Georgia and found a fraud

Comment  American internet users are, seemingly, getting a quarter of the internet speed they are paying for.…

14:55

When Harry met celly: NSA hoarder thrown in the clink for 9 years – after taking classified work home for decades [The Register]

Contractor Martin sentenced for squirreling away 50TB of hush-hush files, exploits

An ex-NSA contractor who admitted stashing some 50TB of secret US government documents and exploit code at his home was today sentenced to nine years behind bars.…

14:30

Literally braking news: Two people hurt as not one but two self-driving space-age buses go awry [The Register]

One hit by robo-ride, another injured in US, Austrian trials

Two driverless vehicle trials were temporarily halted this week after self-driving mini-buses encountered obstacles – or thought they did – resulting in minor injuries to a rider and a pedestrian.…

14:00

All very MoD-ern: RAF test pilot headed into space with Virgin, £30m small sat demo project [The Register]

Defence ministry gets with the Apollo vibes

Roundup  As the world celebrates the 50th anniversary of the Apollo 11 Moon mission, the UK's Ministry of Defence has gone a bit wacky – not only does it have fresh space plans, but it also wants to strap laser zappers to stuff too.…

14:00

Pango 1.44 Is Coming Thanks To The Revival By GNOME Developers [Phoronix]

Back in May, Red Hat's Matthias Clasen shared plans to work out some improvements to the Pango layout engine library, which had gone fairly stale in recent years. That work is coming to fruition: a Pango 1.44 release with new features looks like it will be here soon...

13:20

Zstd 1.4.1 Further Improves Decode Speed, Other Optimizations [Phoronix]

Zstd 1.4.1 is out today as a maintenance release of Facebook's Zstandard compression algorithm, but with this update come even more performance optimizations...

13:00

You'll never guess what US mad lads Throwflame have strapped to a drone (clue: it does exactly what it says on the tin) [The Register]

America. Fsck yeah

US firm Throwflame is selling – you guessed it – a flamethrower attachment for drones.…

12:00

France seeks science-fiction writers to help futureproof its military against science-fact [The Register]

'Aliens'

The French army has said it is looking to recruit four or five sci-fi writers and futurologists to staff a "Red Team" that predicts future threats and how to disrupt or defend against them.…

11:00

Israel's NSO Group: Our malware? Slurp your cloud backups plus phone data? They've misunderstood [The Register]

After report claimed its sales pitches boasted of doing that

Israeli spyware firm NSO Group has denied it developed malware that can steal user data from cloud services run by Amazon, Apple, Facebook, Google and Microsoft.…

10:00

UK.gov drives ever further into Nocluesville, crowdsources how to solve digital identity [The Register]

Look, we're all out of ideas!

Fresh out of ideas on how to crack the problem of digital identity, the UK government has put out a consultation asking what the hell it should do next.…

09:00

Echelon gets the upper hand: Scores final nod for 100MW bit barn campus in Arklow, Ireland [The Register]

Appeal from competitor who fought Apple application fails to stop project

Recently established data centre developer Echelon has received permission to build a massive bit barn campus in County Wicklow, Ireland.…

08:09

Estate agent dodges GDPR-sized bullet after exposing 18,610 folks' data for two years [The Register]

Fined £80,000 under Data Protection Act – could have been a lot more under new EU rules

A London estate agent has been fined £80,000 for losing thousands of clients' personal data when it was handed over to a third party.…

07:15

You totally need VMs to do AI, nods VMware as Bitfusion dissolves in its vSphere of influence [The Register]

Virty GPU upstart joins the fold

Crusty old VMware is attempting to keep up with the youngsters by acquiring Bitfusion, a startup that claims to enable machine learning on any VM via the magic of network-attached GPUs.…

07:00

Your biz won't be hacked by a super-leet exploit. It'll be Bob in sales opening a dodgy email [The Register]

Or Sam connecting a vulnerable dev box to production. Here's your gentle guide to risks and threats menacing your IT

Backgrounder  The good news for enterprise security is that the number of reported cyberattacks is going down, in the UK at least.…

07:00

Securing infrastructure at scale with Cloudflare Access [The Cloudflare Blog]


I rarely have to deal with the hassle of using a corporate VPN and I hope it remains this way. As a new member of the Cloudflare team, that seems possible. Coworkers who joined a few years ago did not have that same luck. They had to use a VPN to get any work done. What changed?

Cloudflare released Access, and now we’re able to do our work without ever needing a VPN again. Access is a way to control access to your internal applications and infrastructure. Today, we’re releasing a new feature to help you replace your VPN by deploying Access at an even greater scale.

Access in an instant

Access replaces a corporate VPN by evaluating every request made to a resource secured behind Access. Administrators can make web applications, remote desktops, and physical servers available at dedicated URLs, configured as DNS records in Cloudflare. These tools are protected via access policies, set by the account owner, so that only authenticated users can access those resources. End users can be authenticated over both HTTPS and SSH requests. They’re prompted to log in with their SSO credentials and Access redirects them to the application or server.

For your team, Access makes your internal web applications and servers in your infrastructure feel as seamless to reach as your SaaS tools. Originally we built Access to replace our own corporate VPN. In practice, this became the fastest way to control who can reach different pieces of our own infrastructure. However, administrators configuring Access were required to create a discrete policy for each application/hostname. Now, administrators don’t have to create a dedicated policy for each new resource secured by Access; one policy can cover every protected URL.

When Access launched, the product’s primary use case was to secure internal web applications. Creating unique rules for each was tedious, but manageable. Access has since become a centralized way to secure infrastructure in many environments. Now that companies are using Access to secure hundreds of resources, that method of building policies no longer fits.

Starting today, Access users can build policies using a wildcard subdomain, removing the typical bottleneck of maintaining dozens or even hundreds of bespoke rules by replacing them with a single policy. With a wildcard, the same ruleset will automatically apply to any subdomain your team generates that is gated by Access.

How can teams deploy at scale with wildcard subdomains?

Administrators can secure their infrastructure with a wildcard policy in the Cloudflare dashboard. With Access enabled, Cloudflare adds identity-based evaluation to that traffic.

In the Access dashboard, you can now build a rule to secure any subdomain of the site you added to Cloudflare. Create a new policy and enter a wildcard tag (“*”) into the subdomain field. You can then configure rules, at a granular level, using your identity provider to control who can reach any subdomain of that apex domain.
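
If you prefer to manage Access programmatically, the same setup can be scripted against the Cloudflare API. Treat the sketch below as a minimal illustration: the endpoint paths, field names, and the "Dev sites" and "Allow staff" labels are assumptions for this example, so consult the Access API documentation for the authoritative reference.

# Hypothetical sketch: create an Access application covering every subdomain,
# then attach an allow policy for a corporate email domain.
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/access/apps" \
  -H "X-Auth-Email: $CF_EMAIL" -H "X-Auth-Key: $CF_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{"name":"Dev sites","domain":"*.example.com","session_duration":"24h"}'

curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/access/apps/$APP_ID/policies" \
  -H "X-Auth-Email: $CF_EMAIL" -H "X-Auth-Key: $CF_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{"name":"Allow staff","decision":"allow","include":[{"email_domain":{"domain":"example.com"}}]}'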


This new policy will propagate to all 180 of Cloudflare’s data centers in seconds and any new subdomains created will be protected.


How are teams using it?

Since releasing this feature in a closed beta, we’ve seen teams use it to gate access to their infrastructure in several new ways. Many teams use Access to secure dev and staging environments of sites that are being developed before they hit production. Whether for QA or collaboration with partner agencies, Access helps make it possible to share sites quickly with a layer of authentication. With wildcard subdomains, teams are deploying dozens of versions of new sites at new URLs without needing to touch the Access dashboard.

For example, an administrator can create a policy for “*.example.com” and then developers can deploy iterations of sites at “dev-1.example.com” and “dev-2.example.com” and both inherit the global Access policy.

The feature is also helping teams lock down their entire hybrid, on-premise, or public cloud infrastructure with the Access SSH feature. Teams can assign dynamic subdomains to their entire fleet of servers, regardless of environment, and developers and engineers can reach them over an SSH connection without a VPN. Administrators can now bring infrastructure online, in an entirely new environment, without additional or custom security rules.
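
As a minimal sketch of what the client side can look like (the hostname pattern and binary path here are illustrative), a single SSH config entry routes connections for every protected subdomain through cloudflared:

# ~/.ssh/config -- proxy SSH connections to Access-protected hosts through cloudflared
Host *.example.com
  ProxyCommand /usr/local/bin/cloudflared access ssh --hostname %h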

What about creating DNS records?

Cloudflare Access requires users to associate a resource with a domain or subdomain. While the wildcard policy will cover all subdomains, teams will still need to connect their servers to the Cloudflare network and generate DNS records for those services.

Argo Tunnel can reduce that burden significantly. Argo Tunnel lets you expose a server to the Internet without opening any inbound ports. The service runs a lightweight daemon on your server that initiates outbound tunnels to the Cloudflare network.

Instead of managing DNS, network, and firewall complexity, Argo Tunnel helps administrators serve traffic from their origin through Cloudflare with a single command. That single command will generate the DNS record in Cloudflare automatically, allowing you to focus your time on building and managing your infrastructure.
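
For example, one command serves a local web app at a protected subdomain (the hostname and port here are illustrative) and creates the DNS record for you:

cloudflared tunnel --hostname dev-1.example.com --url http://localhost:8000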

What’s next?

More teams are adopting a hybrid or multi-cloud model for deploying their infrastructure. In the past, these teams were left with just two options for securing those resources: peering a VPN with each provider or relying on custom IAM flows with each environment. In the end, both of these solutions were not only quite costly but also equally unmanageable.

While infrastructure benefits from becoming distributed, security is something that is best when controlled in a single place. Access can consolidate how a team controls who can reach their entire fleet of servers and services.

06:38

The Open-Source NVIDIA "Nouveau" Driver Gets A Batch Of Fixes For Linux 5.3 [Phoronix]

Last week's big DRM pull request for Linux 5.3 brought Navi support most notably on the AMD side, while Intel received HDR display support, continued Icelake/Gen11 work, and more, but there weren't any changes to the open-source NVIDIA "Nouveau" driver. It was another unfortunate cycle of no major improvements for Nouveau, but at least a set of new "fixes" was sent out today for this driver, which remains crippled on Maxwell GPUs and newer...

06:31

Excluding Huawei from UK's 5G will harm security, MPs warn [The Register]

A decision must be made as a 'matter of urgency', says Intelligence and Security Committee

Excluding Huawei from the UK's 5G network infrastructure would harm resilience and "lower security standards", the Intelligence and Security Committee (ISC) warned today.…

06:25

Saturday Morning Breakfast Cereal - GAN [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
The other day I was really freaked out that a computer could generate faces of people who DON'T REALLY EXIST, only to later realize painters have been doing this for several millennia.



05:50

BT staffers fear new mums could be hit disproportionately by car allowance change [The Register]

People Framework strikes again: 'Employees feel worse off and demoted'

A number of female staff at BT who take maternity leave could be disproportionately affected by plans to remove car allowances from certain employees.…

05:05

Apollo 11 @ 50: The long shadow of the flag [The Register]

1969 and all that, or why NASA shouldn't be all about lunar footprints

Tomorrow it will be a full 50 years since humans first set foot on the Moon, and nearly 50 years since the annual hand-wringing began over why none have gone back after Apollo 17.…

04:25

Wayland's Weston Lands A Pipewire Plug-In As New Remote Desktop Streaming Option [Phoronix]

Wayland's Weston compositor for the past year has provided a remoting plug-in for virtual output streaming that was built atop RTP/GStreamer. Now though a new plug-in has landed in the Weston code-base making use of Red Hat's promising PipeWire project...

04:25

2025: HELLO? WHAT? I CAN'T HEAR YOU, I'M ON THE TUBE. FULL 4G NOW. NAH, IT'S CRAP [The Register]

Transport for London missed the train for 2019 deadline

Transport for London is to trial 4G services on the eastern half of the Jubilee line, and is looking to work with a firm that wants to run an Underground-wide network by the mid-2020s.…

04:07

Libinput 1.14 RC Arrives With Better Thumb Detection & Dell Canvas Totem Support [Phoronix]

Linux input expert Peter Hutterer of Red Hat shipped the much anticipated release candidate today for libinput 1.14, the open-source input handling library used by both X.Org and Wayland systems...

03:41

Guess who reserved their seat on the first Moon flight? My mum, that's who [The Register]

Of course the ticket's real: it says 'Pan Am' at the top, doesn't it?

Something for the Weekend, Sir?  My mother won a ticket to the Moon.…

03:00

UK government buys off Serco lawsuit with £10m bung. Whew. Now Capita can start running fire and rescue [The Register]

Makes you proud to be British

The Ministry of Defence has slipped £10m of British taxpayers' money into Serco's back pocket to settle a legal challenge over the award of a £525m Fire and Rescue services contract to rival outsourcer Capita.…

02:15

What else can we add to UK.gov's tech project bonfire? Oh yeah, 5G [The Register]

Watchdog casts doubt on testbeds and trials scheme

The UK's £217m 5G testbed trials have already hit a major speed bump due to a lack of available equipment, according to an official report.…

02:00

Modifying Windows local accounts with Fedora and chntpw [Fedora Magazine]

I recently encountered a problem at work where a client’s Windows 10 PC lost its trust relationship with the domain. The user is an executive, and any hindrance to his computer can affect real-time, mission-critical tasks. He gave me 30 minutes to resolve the issue while he attended a meeting.

Needless to say, I’ve encountered this issue many times in my career. It’s an easy fix using the Windows 7/8/10 installation media to reset the Administrator password, remove the PC from the domain, and rejoin it. Unfortunately it didn’t work this time. After 20 minutes of scouring the net and scanning through the Microsoft Docs with no success, I turned to my development machine running Fedora with hopes of finding a solution.

With dnf search I found a utility called chntpw:

$ dnf search windows | grep password

According to the summary, chntpw will “change passwords in Windows SAM files.”

Little did I know at the time there was more to this utility than explained in the summary. Hence, this article will go through the steps I used to successfully reset a Windows local user password using chntpw and a Fedora Workstation Live boot USB. The article will also cover some of the features of chntpw used for basic user administration.

Installation and setup

If the PC can connect to the internet after booting the live media, install chntpw from the official Fedora repository with:

$ sudo dnf install chntpw

If you’re unable to access the internet, no sweat! Fedora Workstation Live boot media has all the dependencies installed out-of-the-box, so all we need is the package. You can find the builds for your Fedora version from the Fedora Project’s Koji site. You can use another computer to download the utility and use a USB thumb drive, or other form of media to copy the package.
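
As a rough sketch of that offline path (the package file name below is hypothetical; use the build matching your Fedora release and architecture from Koji), copy the RPM to the live system and install the local package with:

$ sudo dnf install ./chntpw-1.00-*.x86_64.rpm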

First and foremost we need to create the Fedora Live USB stick. If you need instructions, the article on How to make a Fedora USB stick is a great reference.

Once the key is created, shut down the Windows PC, insert the thumb drive if the USB key was created on another computer, and turn on the PC — be sure to boot from the USB drive. Once the live media boots, select “Try Fedora” and open the Terminal application.

Also, we need to mount the Windows drive to access the files. Enter the following command to view all drive partitions with an NTFS filesystem:

$ sudo blkid | grep ntfs

Most hard drives are assigned to /dev/sdaX where X is the partition number — virtual drives may be assigned to /dev/vdX, and some newer drives (like SSDs) use /dev/nvmeX. For this example the Windows C drive is assigned to /dev/sda2. To mount the drive enter:

$ sudo mount /dev/sda2 /mnt

Fedora Workstation contains the ntfs-3g and ntfsprogs packages out-of-the-box. If you’re using a spin that does not have NTFS working out of the box, you can install these two packages from the official Fedora repository with:

$ sudo dnf install ntfs-3g ntfsprogs

Once the drive is mounted, navigate to the location of the SAM file and verify that it’s there:

$ cd /mnt/Windows/System32/config
$ ls | grep SAM
SAM
SAM.LOG1
SAM.LOG2

Clearing or resetting a password

Now it’s time to get to work. The help flag -h provides everything we need to know about this utility and how to use it:

$ chntpw -h
chntpw: change password of a user in a Windows SAM file,
or invoke registry editor. Should handle both 32 and 64 bit windows and
all version from NT3.x to Win8.1
chntpw [OPTIONS] <samfile> [systemfile] [securityfile] [otherreghive] [...]
 -h          This message
 -u <user>   Username or RID (0x3e9 for example) to interactively edit
 -l          list all users in SAM file and exit
 -i          Interactive Menu system
 -e          Registry editor. Now with full write support!
 -d          Enter buffer debugger instead (hex editor),
 -v          Be a little more verbose (for debuging)
 -L          For scripts, write names of changed files to /tmp/changed
 -N          No allocation mode. Only same length overwrites possible (very safe mode)
 -E          No expand mode, do not expand hive file (safe mode)

Usernames can be given as name or RID (in hex with 0x first)
See readme file on how to get to the registry files, and what they are.
Source/binary freely distributable under GPL v2 license. See README for details.
NOTE: This program is somewhat hackish! You are on your own!

Use the -l parameter to display a list of users it reads from the SAM file:

$ sudo chntpw -l SAM
chntpw version 1.00 140201, (c) Petter N Hagen
Hive name (from header): <\SystemRoot\System32\Config\SAM>
ROOT KEY at offset: 0x001020 * Subkey indexing type is: 686c
File size 65536 [10000] bytes, containing 7 pages (+ 1 headerpage)
Used for data: 346/37816 blocks/bytes, unused: 23/7016 blocks/bytes.

| RID -|---------- Username ------------| Admin? |- Lock? --|
| 01f4 | Administrator                  | ADMIN  | dis/lock |
| 01f7 | DefaultAccount                 |        | dis/lock |
| 03e8 | defaultuser0                   |        | dis/lock |
| 01f5 | Guest                          |        | dis/lock |
| 03ea | sysadm                         | ADMIN  |          |
| 01f8 | WDAGUtilityAccount             |        | dis/lock |
| 03e9 | WinUser                        |        |          |

Now that we have a list of Windows users we can edit the account. Use the -u parameter followed by the username and the name of the SAM file. For this example, edit the sysadm account:

$ sudo chntpw -u sysadm SAM
chntpw version 1.00 140201, (c) Petter N Hagen
Hive name (from header): <\SystemRoot\System32\Config\SAM>
ROOT KEY at offset: 0x001020 * Subkey indexing type is: 686c
File size 65536 [10000] bytes, containing 7 pages (+ 1 headerpage)
Used for data: 346/37816 blocks/bytes, unused: 23/7016 blocks/bytes.

================= USER EDIT ====================

RID : 1002 [03ea]
Username: sysadm
fullname: SysADM
comment :
homedir :

00000220 = Administrators (which has 2 members)

Account bits: 0x0010 =
[ ] Disabled | [ ] Homedir req. | [ ] Passwd not req. |
[ ] Temp. duplicate | [X] Normal account | [ ] NMS account |
[ ] Domain trust ac | [ ] Wks trust act. | [ ] Srv trust act |
[ ] Pwd don't expir | [ ] Auto lockout | [ ] (unknown 0x08) |
[ ] (unknown 0x10) | [ ] (unknown 0x20) | [ ] (unknown 0x40) |

Failed login count: 0, while max tries is: 0
Total login count: 0

- - - User Edit Menu:
1 - Clear (blank) user password
(2 - Unlock and enable user account) [seems unlocked already]
3 - Promote user (make user an administrator)
4 - Add user to a group
5 - Remove user from a group
q - Quit editing user, back to user select
Select: [q] >

To clear the password press 1 and ENTER. If successful you will see the following message:

...
Select: [q] > 1
Password cleared!
================= USER EDIT ====================

RID : 1002 [03ea]
Username: sysadm
fullname: SysADM
comment :
homedir :

00000220 = Administrators (which has 2 members)

Account bits: 0x0010 =
[ ] Disabled | [ ] Homedir req. | [ ] Passwd not req. |
[ ] Temp. duplicate | [X] Normal account | [ ] NMS account |
[ ] Domain trust ac | [ ] Wks trust act. | [ ] Srv trust act |
[ ] Pwd don't expir | [ ] Auto lockout | [ ] (unknown 0x08) |
[ ] (unknown 0x10) | [ ] (unknown 0x20) | [ ] (unknown 0x40) |

Failed login count: 0, while max tries is: 0
Total login count: 0
** No NT MD4 hash found. This user probably has a BLANK password!
** No LANMAN hash found either. Try login with no password!
...

Verify the change by repeating:

$ sudo chntpw -l SAM
chntpw version 1.00 140201, (c) Petter N Hagen
Hive name (from header): <\SystemRoot\System32\Config\SAM>
ROOT KEY at offset: 0x001020 * Subkey indexing type is: 686c
File size 65536 [10000] bytes, containing 7 pages (+ 1 headerpage)
Used for data: 346/37816 blocks/bytes, unused: 23/7016 blocks/bytes.

| RID -|---------- Username ------------| Admin? |- Lock? --|
| 01f4 | Administrator                  | ADMIN  | dis/lock |
| 01f7 | DefaultAccount                 |        | dis/lock |
| 03e8 | defaultuser0                   |        | dis/lock |
| 01f5 | Guest                          |        | dis/lock |
| 03ea | sysadm                         | ADMIN  | *BLANK*  |
| 01f8 | WDAGUtilityAccount             |        | dis/lock |
| 03e9 | WinUser                        |        |          |

...

The “Lock?” column now shows *BLANK* for the sysadm user. Type q to exit, then y to write the changes to the SAM file. Reboot the machine into Windows and log in using the account (in this case sysadm) without a password.

Features

Furthermore, chntpw can perform basic Windows user administrative tasks. It has the ability to promote the user to the administrators group, unlock accounts, view and modify group memberships, and edit the registry.

The interactive menu

chntpw has an easy-to-use interactive menu to guide you through the process. Use the -i parameter to launch the interactive menu:

$ chntpw -i SAM
chntpw version 1.00 140201, (c) Petter N Hagen
Hive name (from header): <\SystemRoot\System32\Config\SAM>
ROOT KEY at offset: 0x001020 * Subkey indexing type is: 686c
File size 65536 [10000] bytes, containing 7 pages (+ 1 headerpage)
Used for data: 346/37816 blocks/bytes, unused: 23/7016 blocks/bytes.

<>========<> chntpw Main Interactive Menu <>========<>
Loaded hives:
1 - Edit user data and passwords
2 - List groups
- - -
9 - Registry editor, now with full write support!
q - Quit (you will be asked if there is something to save)

Groups and account membership

To display a list of groups and view its members, select option 2 from the interactive menu:

...
What to do? [1] -> 2
Also list group members? [n] y
=== Group # 220 : Administrators
0 | 01f4 | Administrator |
1 | 03ea | sysadm |
=== Group # 221 : Users
0 | 0004 | NT AUTHORITY\INTERACTIVE |
1 | 000b | NT AUTHORITY\Authenticated Users |
2 | 03e8 | defaultuser0 |
3 | 03e9 | WinUser |
=== Group # 222 : Guests
0 | 01f5 | Guest |
=== Group # 223 : Power Users
...
=== Group # 247 : Device Owners

Adding the user to the administrators group

To elevate the user with administrative privileges press 1 to edit the account, then 3 to promote the user:

...
Select: [q] > 3

=== PROMOTE USER
Will add the user to the administrator group (0x220)
and to the users group (0x221). That should usually be
what is needed to log in and get administrator rights.
Also, remove the user from the guest group (0x222), since
it may forbid logins.

(To add or remove user from other groups, please other menu selections)

Note: You may get some errors if the user is already member of some
of these groups, but that is no problem.

Do it? (y/n) [n] : y

Adding to 0x220 (Administrators) …
sam_put_user_grpids: success exit
Adding to 0x221 (Users) …
sam_put_user_grpids: success exit
Removing from 0x222 (Guests) …
remove_user_from_grp: NOTE: group not in users list of groups, may mean user not member at all. Safe. Continuing.
remove_user_from_grp: NOTE: user not in groups list of users, may mean user was not member at all. Does not matter, continuing.
sam_put_user_grpids: success exit

Promotion DONE!

Editing the Windows registry

Certainly the most noteworthy, as well as the most powerful, feature of chntpw is the ability to edit the registry and write to it. Select 9 from the interactive menu:

...
What to do? [1] -> 9
Simple registry editor. ? for help.

> ?
Simple registry editor:
hive [<n>]             - list loaded hives or switch to hive number <n>
cd <key>               - change current key
ls | dir [<key>]       - show subkeys & values,
cat | type <value>     - show key value
dpi <value>            - show decoded DigitalProductId value
hex <value>            - hexdump of value data
ck [<keyname>]         - Show keys class data, if it has any
nk <keyname>           - add key
dk <keyname>           - delete key (must be empty)
ed <value>             - Edit value
nv <type> <valuename>  - Add value
dv <valuename>         - Delete value
delallv                - Delete all values in current key
rdel <keyname>         - Recursively delete key & subkeys
ek <keyname> <filename> - export key to <filename> (Windows .reg file format)
debug                  - enter buffer hexeditor
st [<hexaddr>]         - debug function: show struct info
q                      - quit

Finding help

As we saw earlier, the -h parameter allows us to quickly access a reference guide to the options available with chntpw. The man page contains detailed information and can be accessed with:

$ man chntpw

Also, if you’re interested in a more hands-on approach, spin up a virtual machine. Windows Server 2019 has an evaluation period of 180 days, and Windows Hyper-V Server 2019 is unlimited. Creating a Windows guest VM will provide the basics to modify the Administrator account for testing and learning. For help with quickly creating a guest VM refer to the article Getting started with virtualization in Gnome Boxes.

Conclusion

chntpw is a hidden gem for Linux administrators and IT professionals alike. While it's a nifty tool for quickly resetting Windows account passwords, it can also be used to troubleshoot and modify local Windows accounts with a no-nonsense feel. It is perhaps only one tool for solving the problem, though. If you’ve experienced this issue and have an alternative solution, feel free to put it in the comments below.

This tool, like many other “hacking” tools, holds with it an ethical responsibility. Even chntpw states:

NOTE: This program is somewhat hackish! You are on your own!

When using such programs, we should remember the three edicts outlined in the message displayed when running sudo for the first time:

  1. Respect the privacy of others.
  2. Think before you type.
  3. With great power comes great responsibility.

Photo by Silas Köhler on Unsplash.

01:10

Ubuntu 20.04 LTS Server Planning A New Means For Automated Installations [Phoronix]

Canonical's server team is working on a new means of carrying out automated installations of Ubuntu Server in time for their 20.04 LTS release...

01:03

Operation Desert Sh!tstorm: Routine test shoots down military's top-secret internets [The Register]

How our sysadmin learned to fear vSphere

On Call  Welcome to Friday. The weekend is almost upon us so put down that bacon sarnie and pick up today's On Call, The Register's weekly column of tales from the tipping point.…

00:38

2015 database hack is the terrible gift that keeps giving for Slack: Tens of thousands of passwords now reset [The Register]

Yak app still cleaning up after four-year-old cyber-break-in

Slack says a 2015 database theft is to blame for a large-scale reset of stolen passwords.…

00:00

Don't miss out: Learn essential AI skills from dozens of top-notch speakers, sessions and workshops at this year's MCubed conference [The Register]

Get your early-bird tickets for talks ranging from technical deep-dives to advice on ethics and the law

Event  Your early experiments in machine learning or AI can be tough. Getting them to work in production will be tougher still.…

Thursday, 18 July

23:56

RISC-V's Kernel Support Continues Maturing With Linux 5.3 [Phoronix]

In-step with more RISC-V hardware becoming available over time, the Linux kernel architecture support for RISC-V has continued maturing and with Linux 5.3 is in better shape...

23:03

Incognito mode won't stop smut sites sharing your pervy preferences with Facebook, Google and, er, Oracle [The Register]

93% of onanistic orgs tracking you despite battened-down browsing

Google, Facebook, and, surprisingly, Oracle are among the top ten third-party companies that frequently track your personal sexual interests every time you watch porn, according to new research.…

22:00

AMDGPU/AMDKFD Queue Up Early Linux 5.3 Fixes For Navi & More [Phoronix]

While the Linux 5.3 kernel merge window isn't even over until this weekend, when it will close out with the 5.3-rc1 release and headlining new features like Radeon RX 5700 series support, AMD has already sent in a batch of AMDGPU/AMDKFD fixes. Making these fixes notable are some early fixes around the new open-source Radeon RX "Navi" support...

18:38

Arrested development: Cops dump Amazon's facial-recognition API after struggling to make the thing work properly [The Register]

15 months of wrangling and Orlando couldn't even begin testing AI cloud tech for population surveillance

Orlando cops have given up using Amazon’s controversial cloud-based facial recognition to monitor CCTV cameras dotted around the Florida city – after a nightmare year of technical breakdowns.…

18:31

F2FS Is The Latest Linux File-System With Patches For Case-Insensitive Support [Phoronix]

Following EXT4 getting initial (and opt-in) support for case-insensitive directories/files, the Flash-Friendly File-System has a set of patches pending that extend the case-folding support to this F2FS file-system that is becoming increasingly used by Android smartphones and other devices...

17:04

Cloud makes it rain for Microsoft: IT giant turns green with Azure, cash poured all over investors [The Register]

Record revenue reaches Redmond, the result of a booming as-a-service business

Microsoft on Thursday reported record revenue for its fourth fiscal quarter of 2019 and for its full fiscal year, predictably pushing its stock higher in after-hours trading.…

15:54

The pro-privacy Browser Act has re-appeared in US Congress. But why does everyone except right-wing trolls hate it? [The Register]

Marsha Blackburn's bill is everything wrong with 2019 in 13 pages

Comment  A bi-partisan law bill that promises to give internet users far greater control over their privacy made another appearance in US Congress on Thursday.…

14:57

It's never good when 'Magecart' and 'bulletproof' appear in the same sentence, but here we are [The Register]

Ukrainian civil war a bonanza for dodgy malware hosting firms

A growing crop of so-called bulletproof hosting companies are using the ongoing civil war in Ukraine to host Magecart malware without fear of the police coming knocking.…

14:57

VirtIO-PMEM Driver Added To Linux 5.3 For Paravirtualized Persistent Memory [Phoronix]

In addition to Linux 5.3 bringing a VirtIO-IOMMU driver, this next kernel version is bringing another new VirtIO virtual device implementation: PMEM for para-virtualized persistent memory support for the likes of Intel Optane DC persistent memory...

14:03

npm uninstall co-founder --global: Laurie Voss rides off into the sunset waving goodbye [The Register]

Co-founder and chief data officer at NPM Inc, moves on

Updated  Laurie Voss, the co-founder and chief data officer of widely used JavaScript package registry NPM Inc, today announced in a blog post that he left the company on July 1.…

13:09

We don't mean to poo-poo this, but... The Internet of S**t has literally arrived thanks to Pampers smart diapers [The Register]

You can monitor your child – and their bowels – 24 hours a day! Which is… great?

What's that unpleasant whiff? No, it's not little Johnny's sticky bowel movement but the new "smart diaper" containing his special effort.…

12:36

Oracle Linux 8.0 Released [Phoronix]

In early May, right before the release of Red Hat Enterprise Linux 8.0, we saw the public beta of Oracle Linux 8; today Oracle Linux 8.0 has been promoted to stable and production-ready...

11:30

We need citizen devs, cries Microsoft – but pricey new licensing plans for PowerApps might put paid to that [The Register]

'Empowering' if you've got $$$

Microsoft introduced new licensing plans for its PowerApps platform at the Inspire partner conference this week to accommodate the increasing capabilities available.…

10:33

Soon Google will have more bit barns in Texas than you can shake a stick at: Second facility planned for Ellis County [The Register]

'Alamo Mission LLC' to splash $600m on city of Red Oak

Google has identified a parcel of land for its second bit barn in Ellis County, Texas, even though the first one is still months away from completion.…

10:00

LLVM 9.0 Feature Work Is Over While LLVM 10.0 Enters Development [Phoronix]

Feature work is over on LLVM 9.0 as the next release for this widely-used compiler stack ranging from the AMDGPU shader compiler back-end to the many CPU targets and other innovative use-cases for this open-source compiler infrastructure...

09:52

DRAM, is it cold in here? Semiconductor market expected to shrink 12% in 2019 [The Register]

Device numbers growing, but manufacturers are using fewer/cheaper chips per device

With average prices for semiconductor components going down dramatically in 2019, especially DRAM and NAND, major chipmakers have been forced to reduce their production output. As a result, the silicon market is expected to shrink by 12 per cent year-on-year.…

09:02

Corsair Force MP600 PCIe 4.0 NVMe SSD Benchmarks On Linux [Phoronix]

One of the first PCIe Gen 4 NVMe SSDs to market has been the Corsair Force MP600. AMD included the Corsair MP600 2TB NVMe PCIe4 SSD with their Ryzen 3000 reviewer's kit and for those interested in this speedy solid-state storage here are some benchmarks compared to various other storage devices on Ubuntu Linux.

08:58

Bulgaria hack: 20-year-old infosec whizz cuffed after 'adult population's' finance deets nicked [The Register]

Bosses stick up for suspect, claim he's being framed for pinching 5m folks' data

A 20-year-old infosec bod has been arrested in Bulgaria after most of the country's population had their personal and financial details stolen.…

08:27

Hip and modern IBM can't beat legacy kit and services IBM: That's four consecutive quarters of revenue decline now [The Register]

8% cloud biz growth in 12 months. Where is Red Hat when you need it?

IBM notched up its fourth straight quarter of revenue decline as the areas it deems strategic – hybrid cloud, AI and blockchain – couldn't paper over cracks in the legacy operations of big iron and outsourcing.…

08:12

A Tale of Two (APT) Transports [The Cloudflare Blog]


Securing access to your APT repositories is critical. At Cloudflare, like in most organizations, we used a legacy VPN to lock down who could reach our internal software repositories. However, a network perimeter model lacks a number of features that we consider critical to a team’s security.

As a company, we’ve been moving our internal infrastructure to our own zero-trust platform, Cloudflare Access. Access added SaaS-like convenience to the on-premise tools we managed. We started with web applications and then moved resources we need to reach over SSH behind the Access gateway, for example Git or user-SSH access. However, we still needed to handle how services communicate with our internal APT repository.

We recently open sourced a new APT transport which allows customers to protect their private APT repositories using Cloudflare Access. In this post, we’ll outline the history of APT tooling, APT transports and introduce our new APT transport for Cloudflare Access.

A brief history of APT

Advanced Package Tool, or APT, simplifies the installation and removal of software on Debian and related Linux distributions. Originally released in 1998, APT was to Debian what the App Store was to modern smartphones - a decade ahead of its time!

APT sits atop the lower-level dpkg tool, which is used to install, query, and remove .deb packages - the primary software packaging format in Debian and related Linux distributions such as Ubuntu. With dpkg, packaging and managing software installed on your system became easier - but it didn’t solve for problems around distribution of packages, such as via the Internet or local media; at the time of inception, it was commonplace to install packages from a CD-ROM.

APT introduced the concept of repositories - a mechanism for storing and indexing a collection of .deb packages. APT supports connecting to multiple repositories for finding packages and automatically resolving package dependencies. The way APT connects to said repositories is via a “transport” - a mechanism for communicating between the APT client and its repository source (more on this later).

APT over the Internet

Prior to version 1.5, APT did not include support for HTTPS - if you wanted to install a package over the Internet, your connection was not encrypted. This reduces privacy - an attacker snooping traffic could determine the specific package versions your system is installing. It also exposes you to man-in-the-middle attacks where an attacker could, for example, exploit a remote code execution vulnerability. Just 6 months ago, we saw an example of the latter with CVE-2019-3462.

Enter the APT HTTPS transport - an optional transport you can install to add support for connecting to repositories over HTTPS. Once installed, users need to configure their APT sources.list with repositories using HTTPS.
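
For example, a Debian Stretch sources.list switched over to HTTPS might look like this (the deb.debian.org mirror shown here supports HTTPS; substitute your preferred mirror):

# /etc/apt/sources.list -- repositories reached over HTTPS
deb https://deb.debian.org/debian stretch main
deb https://deb.debian.org/debian-security stretch/updates main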

The challenge here, of course, is that the most common way to install this transport is via APT and HTTP - a classic bootstrapping problem! An alternative here is to download the .deb package via curl and install it via dpkg. You’ll find the links to apt-transport-https binaries for Stretch here - once you have the URL path for your system architecture, you can download it from the deb.debian.org mirror-redirector over HTTPS, e.g. for amd64 (a.k.a. x86_64):

curl -o apt-transport-https.deb -L https://deb.debian.org/debian/pool/main/a/apt/apt-transport-https_1.4.9_amd64.deb 
HASH=c8c4366d1912ff8223615891397a78b44f313b0a2f15a970a82abe48460490cb && echo "$HASH  apt-transport-https.deb" | sha256sum -c
sudo dpkg -i apt-transport-https.deb

To confirm which APT transports are installed on your system, you can list each “method binary” that is installed:

ls /usr/lib/apt/methods

With apt-transport-https installed you should now see ‘https’ in that list.
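
On a typical Stretch system, the listing looks something like this (the exact set varies with what's installed):

ls /usr/lib/apt/methods
cdrom  copy  file  ftp  gpgv  http  https  mirror  rred  rsh  ssh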

The state of APT & HTTPS on Debian

You may be wondering how relevant this APT HTTPS transport is today. Given the prevalence of HTTPS on the web today, I was surprised when I found out exactly how relevant it is.

Up until a couple of weeks ago, Debian Stretch (9.x) was the current stable release; 9.0 was first released in June 2017 - and the latest version (9.9) includes apt 1.4.9 by default - meaning that securing your APT communication for Debian Stretch requires installing the optional apt-transport-https package.

Thankfully, on July 6 of this year, Debian released the latest version - Buster - which currently includes apt 1.8.2 with HTTPS support built-in by default, negating the need for installing the apt-transport-https package - and removing the bootstrapping challenge of installing HTTPS support via HTTPS!

BYO HTTPS APT Repository

A powerful feature of APT is the ability to run your own repository. You can mirror a public repository to improve performance or protect against an outage. And if you’re producing your own software packages, you can run your own repository to simplify distribution and installation of your software for your users.

If you have your own APT repository and you’re looking to secure it with HTTPS we’ve offered free Universal SSL since 2014 and last year introduced a way to require it site-wide automatically with one click. You’ll get the benefits of DDoS attack protection, a Global CDN with Caching, and Analytics.

But what if you’re looking for more than just HTTPS for your APT repository? For companies operating private APT repositories, authentication of your APT repository may be a challenge. This is where our new, custom APT transport comes in.

Building custom transports

The system design of APT is powerful in that it supports extensibility via Transport executables, but how does this mechanism work?

When APT attempts to connect to a repository, it finds the executable which matches the “scheme” from the repository URL (e.g. “https://” prefix on a repository results in the “https” executable being called).

APT then uses the common Linux standard streams: stdin, stdout, and stderr. It communicates via stdin/stdout using a set of plain-text Messages, which follow IETF RFC #822 (the same format that .deb “Package” files use).

Examples of input messages include “600 URI Acquire”, and examples of output messages include “200 URI Start” and “201 URI Done”.

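To make that concrete, here's a simplified sketch of one exchange between APT and a transport (the field values are illustrative; the APT method interface spec defines the full set of fields):

600 URI Acquire
URI: cfd://apt.example.com/v2/stretch/dists/stable/InRelease
Filename: /var/lib/apt/lists/partial/apt.example.com_dists_stable_InRelease

200 URI Start
URI: cfd://apt.example.com/v2/stretch/dists/stable/InRelease
Size: 1609

201 URI Done
URI: cfd://apt.example.com/v2/stretch/dists/stable/InRelease
Filename: /var/lib/apt/lists/partial/apt.example.com_dists_stable_InRelease
Size: 1609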

If you’re interested in building your own transport, check out the APT method interface spec for more implementation details.

APT meets Access

Cloudflare prioritizes dogfooding our own products early and often. The Access product has given our internal DevTools team a chance to work closely with the product team as we build features that help solve use cases across our organization. We’ve deployed new features internally, gathered feedback, improved them, and then released them to our customers. For example, we’ve been able to iterate on tools for Access like the Atlassian SSO plugin and the SSH feature, as collaborative efforts between DevTools and the Access team.

Our DevTools team wanted to take the same dogfooding approach to protect our internal APT repository with Access. We knew this would require a custom APT transport to support generating the required tokens and passing the correct headers in HTTPS requests to our internal APT repository server. We decided to build and test our own transport that both generated the necessary tokens and passed the correct headers to allow us to place our repository behind Access.

After months of internal use, we’re excited to announce that we have recently open-sourced our custom APT transport, so our customers can also secure their APT repositories by enabling authentication via Cloudflare Access.

By protecting your APT repository with Cloudflare Access, you can support authenticating users via Single-Sign On (SSO) providers, defining comprehensive access-control policies, and monitoring access and change logs.

Our APT transport leverages another Open Source tool we provide, cloudflared, which enables users to connect to your Cloudflare-protected domain securely.

Securing your APT Repository

To use our APT transport, you’ll need an APT repository that’s protected by Cloudflare Access. Our instructions (below) for using our transport will use apt.example.com as a hostname.

To use our APT transport with your own web-based APT repository, refer to our Setting Up Access guide.

APT Transport Installation

To install from source, both tools require Go - once you install Go, you can install `cloudflared` and our APT transport with four commands:

go get github.com/cloudflare/cloudflared/cmd/cloudflared
sudo cp ${GOPATH:-~/go}/bin/cloudflared /usr/local/bin/cloudflared
go get github.com/cloudflare/apt-transport-cloudflared/cmd/cfd
sudo cp ${GOPATH:-~/go}/bin/cfd /usr/lib/apt/methods/cfd

The above commands should place the cloudflared executable in /usr/local/bin (which should be on your PATH), and the APT transport binary in the required /usr/lib/apt/methods directory.

To confirm cloudflared is on your path, run:

which cloudflared

The above command should return /usr/local/bin/cloudflared

Now that the custom transport is installed, to start using it simply configure an APT source with cfd:// rather than https://, e.g.:

$ cat /etc/apt/sources.list.d/example.list 
deb [arch=amd64] cfd://apt.example.com/v2/stretch stable common

Next time you do `apt-get update` and `apt-get install`, a browser window will open asking you to log in over Cloudflare Access, and your package will be retrieved using the token returned by `cloudflared`.

Fetching a GPG Key over Access

Usually, private APT repositories will use SecureApt and have their own GPG public key that users must install to verify the integrity of data retrieved from that repository.

Users can also leverage cloudflared for securely downloading and installing those keys, e.g:

cloudflared access login https://apt.example.com
cloudflared access curl https://apt.example.com/public.gpg | sudo apt-key add -

The first command will open your web browser allowing you to authenticate for your domain. The second command wraps curl to download the GPG key, and hands it off to `apt-key add`.

Cloudflare Access on "headless" servers

If you’re looking to deploy APT repositories protected by Cloudflare Access to non-user-facing machines (a.k.a. “headless” servers), opening a browser does not work. The good news is that, since February, Cloudflare Access has supported service tokens - and we’ve built support for them into our APT transport from day one.

If you’d like to use service tokens with our APT transport, it’s as simple as placing the token in a file in the correct path; because the machine already has a token, there is also no dependency on `cloudflared` for authentication. You can find details on how to set-up a service token in the APT transport README.

What’s next?

As demonstrated, you can get started using our APT transport today - we’d love to hear your feedback on this!

This work came out of an internal dogfooding effort, and we’re currently experimenting with additional packaging formats and tooling. If you’re interested in seeing support for another format or tool, please reach out.

07:54

Big fat doubt hovers over UK.gov's Making Tax Digital, customs declaration IT projects [The Register]

Plus: Delivery on 9 more projects at 'major risk'

A raft of major government IT projects are in serious trouble, including flagship programmes such as HMRC's Making Tax Digital and its Customs Declaration Service.…

07:19

Saturday Morning Breakfast Cereal - Test [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
When cheating on tests, you really only want to look at work by humans, chimps, or parrots.



06:59

Those facial recognition trials in the UK? They should be banned, warns Parliamentary committee [The Register]

Latest call to halt creepy tech likely to fall on deaf ears

Updated  The UK government should slap a "moratorium" on the current use of facial recognition technology, with "no further trials" until there is a legal framework in place, a Parliamentary committee warned today.…

06:45

Qualcomm fined €242m over 'predatory pricing' that helped to knock off British competitor Icera [The Register]

Too late for chip flinger, but a win for the EU taxpayer

The European Commission has issued American chip maker Qualcomm with a hefty €242m fine for anti-competitive practices.…

06:26

Ubuntu's Zsys Client/Daemon For ZFS On Linux Continues Maturing For Eoan [Phoronix]

Looking ahead to Ubuntu 19.10 as the cycle before Ubuntu 20.04 LTS, one of the areas of Canonical's work exciting us (besides the great upstream GNOME performance work) is the better ZFS On Linux integration they are pursuing, with the aim of even offering ZFS as a file-system option from their desktop installer. A big role in their ZoL play is also the new "Zsys" component they have been developing...

05:59

Japan and Greece collide as Toshiba's storage biz spinoff reborn as Kioxia [The Register]

Kioku (memory) + axia (value) = $$$

Logowatch  Toshiba Memory Corporation has emerged, reborn, from the depths of the strategy boutique, as Kioxia.…

05:34

Microsoft demos end-to-end voting verification system ElectionGuard, code will be on GitHub [The Register]

'Defending democracy' initiative to resist nation-state attacks

Microsoft has demonstrated its ElectionGuard electronic vote system at the Aspen Security Forum under way in Colorado and warned that nearly 10,000 of its customers have been targeted by nation-state attacks.…

04:52

'Member Ke3chang? They're still at it, you know. Euro diplomats targeted by 'China-based' hacker crew [The Register]

Click your mouse three times... there's no place like a back door to your machine - ESET

An old-school shadowy malware group believed to operate out of China has been targeting diplomats with what infosec researchers say is a previously undocumented backdoor.…

04:30

Ceph Sees "Lots Of Exciting Things" For Linux 5.3 Kernel [Phoronix]

For those making use of the Ceph fault-tolerant storage platform, a number of updated kernel bits are landing in Linux 5.3...

04:20

Red flag: Verify to be marked 'undeliverable' by gov projects watchdog [The Register]

Digital identity crisis intensifies

Exclusive  The UK government's troubled £154m digital identity project Verify is to be flagged red by Whitehall's major projects watchdog, meaning delivery looks unachievable - according to sources.…

04:17

DragonFlyBSD Pulls In The Radeon Driver Code From Linux 4.4 [Phoronix]

While the Linux 4.4 kernel is quite old (January 2016), DragonFlyBSD has now re-based its AMD Radeon kernel graphics driver against that release. It is at least a big improvement over its previous Radeon code, which was derived from Linux 3.19...

04:09

Latte Dock 0.9 Beta Brings Wayland Improvements, Smoother Experience [Phoronix]

It's been over a year since the release of Latte Dock 0.8, this KDE-aligned desktop dock, and now the v0.9 release isn't too far away...

03:30

Elon Musk's new idea is to hook your noggin up to an AI – but is he just insane about the brain? [The Register]

We ask top boffins if the plan is good enough for skull drilling

Analysis  Silicon Valley bad boy Elon Musk's grand plan to build brain-machine interfaces to "achieve a symbiosis with artificial intelligence" is obviously more science fiction than fact at the moment.…

03:00

Rust in peace: Memory bugs in C and C++ code cause security issues so Microsoft is considering alternatives once again [The Register]

Redmond engineer hints at taking super-lang for a spin

Microsoft Security Response Center (MSRC) is waxing lyrical about the risks inherent in C and C++ coding, arguing it may be time to dump "unsafe legacy languages" and shift to more modern, safer ones.…

02:15

Banks bid legacy tech farewell as they sail to the cloud – but now all that infrastructure is in hands of the big three [The Register]

MPs hear how financial services are trying to improve stability in wake of TSB's meltdown

Shifting financial services to the public cloud risks creating an over-reliance on the "dominant" service providers, banking heads told MPs yesterday during an inquiry into IT outages in the sector.…

01:04

Dutch cops collar fella accused of crafting and flogging Office macro nasties to cyber-crooks [The Register]

Accused bloke cuffed after plod swoop on home

A 20-year-old man from the Netherlands accused of building and selling Office macro malware was arrested Wednesday.…

Wednesday, 17 July

23:57

Fresh stalkerware crop pops up on Google's Android Play Store, swiftly yanked offline [The Register]

130,000 have already downloaded creepware

Seven new stalkerware apps have been spotted for sale on the Android Play Store, despite Google's policy against the invasive monitoring tools.…

23:34

Don't give it away, give it away, give it away now, bot busting biz tells reCAPTCHA data serfs [The Register]

Instead of enriching Google, try making a market for click work

Analysis  Internet companies depend on free labor. Companies like Amazon, Facebook and Google rely upon content creators who give their work away for the sake of platform participation or perhaps naive altruism.…

22:13

The NVMe Patches To Support Linux On Newer Apple Macs Are Under Review [Phoronix]

At the start of the month we reported on out-of-tree kernel work to support Linux on the newer Macs. Those patches were focused on supporting Apple's NVMe drive behavior by the Linux kernel driver. That work has been evolving nicely and is now under review on the kernel mailing list...

15:48

ZFS On Linux Has Figured Out A Way To Restore SIMD Support On Linux 5.0+ [Phoronix]

Those running ZFS On Linux (ZoL) on post-5.0 (and pre-5.0 supported LTS releases) have seen big performance hits to the ZFS encryption performance in particular. That came due to upstream breaking an interface used by ZFS On Linux and admittedly not caring about ZoL due to it being an out-of-tree user. But now several kernel releases later, a workaround has been devised...

11:05

CompuLab Turns An 8-Core/16-Thread Xeon, 64GB RAM, NVIDIA Quadro RTX 4000 Into Fan-Less Computer [Phoronix]

Three years ago we checked out the CompuLab Airtop as a high-performance fanless PC. Back then it was exciting to passively cool an Intel Core i7 5775C, 16GB of RAM, a SATA 3.0 SSD, and a GeForce GTX 950 graphics card. But now in 2019, thanks to continued design improvements by CompuLab and ever-advancing tech, their newly-launched Airtop3 can accommodate an eight-core / sixteen-thread Xeon CPU, 64GB of RAM, NVMe SSD storage, and an NVIDIA Quadro RTX 4000 graphics card without any fans!

10:15

Intel's Linux Driver To Load HuC Firmware By Default For Icelake+ [Phoronix]

For several generations now of Intel graphics there have been the GuC/HuC firmware binaries while beginning with Icelake "Gen 11" graphics those binary blobs will be loaded by default...

07:07

Mesa 19.2 Is Just Six Patches Away From Seeing OpenGL 4.6 Support [Phoronix]

Later this month marks two years since the release of OpenGL 4.6 and just ahead of that date it looks like Mesa could finally land its complete GL 4.6 implementation, at least as far as the Intel open-source graphics driver support is concerned...

06:36

Systemd 243 Is Getting Buttoned Up For Release With New Features & Fixes [Phoronix]

While it would have been nice to see this next systemd release sooner, due to the Zen 2 + RdRand issue with systemd yielding an unbootable system (which is now also being worked around with a BIOS upgrade), the systemd 243 release looks like it will take place in the near future...

05:39

Saturday Morning Breakfast Cereal - Quadrants [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
Now, your homework is to make a detailed four-quadrant scatter plot.


Today's News:

05:30

OpenSUSE Enables LTO By Default For Tumbleweed - Smaller & Faster Binaries [Phoronix]

Over the past few months openSUSE developers have been working on enabling LTO by default for their packages, and now, with the newest release of the rolling-release openSUSE Tumbleweed, this goal has finally been accomplished...

02:00

Bond WiFi and Ethernet for easier networking mobility [Fedora Magazine]

Sometimes one network interface isn’t enough. Network bonding allows multiple network connections to act together with a single logical interface. You might do this because you want more bandwidth than a single connection can handle. Or maybe you want to switch back and forth between your wired and wireless networks without losing your network connection.

The latter applies to me. One of the benefits to working from home is that when the weather is nice, it’s enjoyable to work from a sunny deck instead of inside. But every time I did that, I lost my network connections. IRC, SSH, VPN — everything goes away, at least for a moment while some clients reconnect. This article describes how I set up network bonding on my Fedora 30 laptop to seamlessly move from the wired connection on my laptop dock to a WiFi connection.

In Linux, interface bonding is handled by the bonding kernel module. Fedora does not ship with this enabled by default, but it is included in the kernel-core package. This means that enabling interface bonding is only a command away:

sudo modprobe bonding

Note that this will only have effect until you reboot. To permanently enable interface bonding, create a file called bonding.conf in the /etc/modules-load.d directory that contains only the word “bonding”.
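For example, a one-line sketch of that step:

echo "bonding" | sudo tee /etc/modules-load.d/bonding.conf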

Now that you have bonding enabled, it’s time to create the bonded interface. First, you must get the names of the interfaces you want to bond. To list the available interfaces, run:

sudo nmcli device status

You will see output that looks like this:

DEVICE          TYPE      STATE         CONNECTION         
enp12s0u1       ethernet  connected     Wired connection 1
tun0            tun       connected     tun0               
virbr0          bridge    connected     virbr0             
wlp2s0          wifi      disconnected  --      
p2p-dev-wlp2s0  wifi-p2p disconnected  --      
enp0s31f6       ethernet  unavailable   --      
lo              loopback  unmanaged     --                 
virbr0-nic      tun       unmanaged     --       

In this case, there are two (wired) Ethernet interfaces available. enp12s0u1 is on a laptop docking station, and you can tell that it’s connected from the STATE column. The other, enp0s31f6, is the built-in port in the laptop. There is also a WiFi connection called wlp2s0. enp12s0u1 and wlp2s0 are the two interfaces we’re interested in here. (Note that it’s not necessary for this exercise to understand how network devices are named, but if you’re interested you can see the systemd.net-naming-scheme man page.)

The first step is to create the bonded interface:

sudo nmcli connection add type bond ifname bond0 con-name bond0

In this example, the bonded interface is named bond0. The “con-name bond0” sets the connection name to bond0; leaving this off would result in a connection named bond-bond0. You can also set the connection name to something more human-friendly, like “Docking station bond” or “Ben”.

The next step is to add the interfaces to the bonded interface:

sudo nmcli connection add type ethernet ifname enp12s0u1 master bond0 con-name bond-ethernet
sudo nmcli connection add type wifi ifname wlp2s0 master bond0 ssid Cotton con-name bond-wifi

As above, the connection name is specified to be more descriptive. Be sure to replace enp12s0u1 and wlp2s0 with the appropriate interface names on your system. For the WiFi interface, use your own network name (SSID) where I use “Cotton”. If your WiFi connection has a password (and of course it does!), you’ll need to add that to the configuration, too. The following assumes you’re using WPA2-PSK authentication:

sudo nmcli connection modify bond-wifi wifi-sec.key-mgmt wpa-psk
sudo nmcli connection edit bond-wifi

The second command will bring you into the interactive editor where you can enter your password without it being logged in your shell history. Enter the following, replacing password with your actual password:

set wifi-sec.psk password
save
quit

Now you’re ready to start your bonded interface and the secondary interfaces you created:

sudo nmcli connection up bond0
sudo nmcli connection up bond-ethernet
sudo nmcli connection up bond-wifi

You should now be able to disconnect your wired or wireless connections without losing your network connections.

A caveat: using other WiFi networks

This configuration works well when moving around on the specified WiFi network, but when away from this network, the SSID used in the bond is not available. Theoretically, one could add an interface to the bond for every WiFi connection used, but that doesn’t seem reasonable. Instead, you can disable the bonded interface:

sudo nmcli connection down bond0

When back on the defined WiFi network, simply start the bonded interface as above.

Fine-tuning your bond

By default, the bonded interface uses the “load balancing (round-robin)” mode. This spreads the load equally across the interfaces. But if you have a wired and a wireless connection, you may want to prefer the wired connection. The “active-backup” mode enables this. You can specify the mode and primary interface when you are creating the interface, or afterward using this command (the bonded interface should be down):

sudo nmcli connection modify bond0 +bond.options "mode=active-backup,primary=enp12s0u1"
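Once the bond is up again, you can check which mode is in effect and which interface is currently active via the status file the bonding driver exposes (assuming the standard /proc interface and the bond0 device name used above):

cat /proc/net/bonding/bond0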

The kernel documentation has much more information about bonding options.

00:03

Cloudflare's new Lisbon office [The Cloudflare Blog]

Cloudflare's new Lisbon office

I was the 24th employee of Cloudflare and the first to work outside of San Francisco. Working from home, in a makeshift office, I wrote a chunk of Cloudflare's software before starting to recruit a team in London. Today, Cloudflare London, our headquarters for EMEA (Europe, the Middle East and Africa), has more than 200 people working in the historic County Hall building opposite the British Parliament. My makeshift office is a thing of the past.

Cloudflare's new Lisbon office
CC BY-SA 2.0 image by Sridhar Saraf

But Cloudflare didn't stop at London. Today we have people in Munich, Singapore, Beijing, Austin, Texas, Chicago and Champaign, Illinois, New York, Washington, DC, San Jose, California, Miami, Florida, and Sydney, Australia, as well as San Francisco and London. And today we are announcing the opening of a new office in Lisbon, Portugal. This summer, for the office opening, I will be going to Lisbon together with a small group of technical staff from several Cloudflare offices.

We are recruiting in Lisbon starting today! Visit this link to see all the current openings. We are looking for candidates in Engineering, Security, Product, Product Strategy, Technology Research, and Customer Support.

If you are interested in a role that is not listed at the link above, email our recruiting team: lisbonjobs@cloudflare.com.

Cloudflare's new Lisbon office
CC BY-SA 2.0 Image by Rustam Aliyev

My first real idea of Lisbon came 30 years ago, through the 1989 publication of John Le Carré's The Russia House. As real, of course, as any Le Carré view of the world:

[...] ten years ago on a whim Barley Blair, having inherited a stray couple of thousand from a remote aunt, bought himself a scruffy pied-a-terre in Lisbon, where he was accustomed to take periodic rests from the burden of his many-sided soul. It could have been Cornwall, it could have been Provence or Timbuktu. But Lisbon by an accident had got him [...]

Our choice of Lisbon was no accident. It was the result of a detailed search in which Cloudflare set out to find a new continental European city in which to locate its new technical office. In 2014 I was invited to Lisbon to speak at SAPO Codebits and was impressed by the quantity and diversity of technical talent present at the event. We subsequently visited 45 cities in 29 different countries, narrowing our final list down to three.

The combination of a large and growing tech ecosystem, an attractive immigration policy, political stability and a high standard of living, along with logistical factors such as the time zone (the same as the UK) and direct flights to San Francisco, made Lisbon the clear winner.

I started learning Portuguese three months ago... and I am looking forward to discovering a country and a culture, and to building a new technical hub for Cloudflare.

We have found an emerging technology ecosystem, supported both by the government and by a set of very interesting startups with which we intend to collaborate, so as to continue raising Lisbon's profile.

00:00

Cloudflare's new Lisbon office [The Cloudflare Blog]

Cloudflare's new Lisbon office

I was the 24th employee of Cloudflare and the first outside of San Francisco. Working out of my spare bedroom, I wrote a chunk of Cloudflare’s software before starting to recruit a team in London. Today, Cloudflare London, our EMEA headquarters, has more than 200 people working in the historic County Hall building opposite the Houses of Parliament. My spare bedroom is ancient history.

Cloudflare's new Lisbon office
CC BY-SA 2.0 image by Sridhar Saraf

And Cloudflare didn’t stop at London. We now have people in Munich, Singapore, Beijing, Austin, TX, Chicago and Champaign, IL, New York, Washington, DC, San Jose, CA, Miami, FL, and Sydney, Australia, as well as San Francisco and London. And today we’re announcing the establishment of a new technical hub in Lisbon, Portugal. As part of that office opening I will be relocating to Lisbon this summer along with a small number of technical folks from other Cloudflare offices.

We’re recruiting in Lisbon starting today. Go here to see all the current opportunities. We’re looking for people to fill roles in Engineering, Security, Product, Product Strategy, Technology Research, and Customer Support.

Cloudflare's new Lisbon office
CC BY-SA 2.0 Image by Rustam Aliyev

My first real idea of Lisbon dates to 30 years ago with the 1989 publication of John Le Carré’s The Russia House. As real, of course, as any Le Carré view of the world:

[...] ten years ago on a whim Barley Blair, having inherited a stray couple of thousand from a remote aunt, bought himself a scruffy pied-a-terre in Lisbon, where he was accustomed to take periodic rests from the burden of his many-sided soul. It could have been Cornwall, it could have been Provence or Timbuktu. But Lisbon by an accident had got him [...]

Cloudflare’s choice of Lisbon, however, came not by way of an accident but a careful search for a new continental European city in which to locate a technical office. I had been invited to Lisbon back in 2014 to speak at SAPO Codebits and been impressed by the size and range of technical talent present at the event. Subsequently, we looked at 45 cities across 29 countries, narrowing down to a final list of three.

Lisbon’s combination of a large and growing existing tech ecosystem, attractive immigration policy, political stability, high standard of living, as well as logistical factors like time zone (the same as the UK) and direct flights to San Francisco made it the clear winner.

Eu comecei a aprender Português há três meses... (I started learning Portuguese three months ago) and I’m looking forward to discovering a country and a culture, and building a new technical hub for Cloudflare. We have found a thriving local technology ecosystem, supported both by the government and a myriad of exciting startups, and we look forward to collaborating with them to continue to raise Lisbon's profile.

Tuesday, 16 July

08:09

Saturday Morning Breakfast Cereal - Heroism [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
I think we could implement this in all sports, with the one downside that they'd be unwatchable.


Today's News:

Monday, 15 July

08:13

Saturday Morning Breakfast Cereal - OK [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
Also your incipient total torso collapse will make laughter impossible.


Today's News:

Sunday, 14 July

07:23

Saturday Morning Breakfast Cereal - Golden Age [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
That poor little box-shaped robot really really wants to make lunch.


Today's News:

Saturday, 13 July

07:14

Saturday Morning Breakfast Cereal - Drink [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
Dammit, is this one of those times where the bonus panel is better than the comic?


Today's News:

Friday, 12 July

09:45

Details of the Cloudflare outage on July 2, 2019 [The Cloudflare Blog]

Almost nine years ago, Cloudflare was a tiny company and I was a customer not an employee. Cloudflare had launched a month earlier and one day alerting told me that my little site, jgc.org, didn’t seem to have working DNS any more. Cloudflare had pushed out a change to its use of Protocol Buffers and it had broken DNS.

I wrote to Matthew Prince directly with an email titled “Where’s my dns?” and he replied with a long, detailed, technical response (you can read the full email exchange here) to which I replied:

From: John Graham-Cumming
Date: Thu, Oct 7, 2010 at 9:14 AM
Subject: Re: Where's my dns?
To: Matthew Prince

Awesome report, thanks. I'll make sure to call you if there's a
problem.  At some point it would probably be good to write this up as
a blog post when you have all the technical details because I think
people really appreciate openness and honesty about these things.
Especially if you couple it with charts showing your post launch
traffic increase.

I have pretty robust monitoring of my sites so I get an SMS when
anything fails.  Monitoring shows I was down from 13:03:07 to
14:04:12.  Tests are made every five minutes.

It was a blip that I'm sure you'll get past.  But are you sure you
don't need someone in Europe? :-)

To which he replied:

From: Matthew Prince
Date: Thu, Oct 7, 2010 at 9:57 AM
Subject: Re: Where's my dns?
To: John Graham-Cumming

Thanks. We've written back to everyone who wrote in. I'm headed in to
the office now and we'll put something on the blog or pin an official
post to the top of our bulletin board system. I agree 100%    
transparency is best.

And so, today, as an employee of a much, much larger Cloudflare, I get to be the one who writes, transparently, about a mistake we made, its impact, and what we are doing about it.

The events of July 2

On July 2, we deployed a new rule in our WAF Managed Rules that caused CPUs to become exhausted on every CPU core that handles HTTP/HTTPS traffic on the Cloudflare network worldwide. We are constantly improving WAF Managed Rules to respond to new vulnerabilities and threats. In May, for example, we used the speed with which we can update the WAF to push a rule to protect against a serious SharePoint vulnerability. Being able to deploy rules quickly and globally is a critical feature of our WAF.

Unfortunately, last Tuesday’s update contained a regular expression that backtracked enormously and exhausted CPU used for HTTP/HTTPS serving. This brought down Cloudflare’s core proxying, CDN and WAF functionality. The following graph shows CPUs dedicated to serving HTTP/HTTPS traffic spiking to nearly 100% usage across the servers in our network.

CPU utilization in one of our PoPs during the incident

This resulted in our customers (and their customers) seeing a 502 error page when visiting any Cloudflare domain. The 502 errors were generated by the front line Cloudflare web servers that still had CPU cores available but were unable to reach the processes that serve HTTP/HTTPS traffic.

We know how much this hurt our customers. We’re ashamed it happened. It also had a negative impact on our own operations while we were dealing with the incident.

It must have been incredibly stressful, frustrating and frightening if you were one of our customers. It was even more upsetting because we haven’t had a global outage for six years.

The CPU exhaustion was caused by a single WAF rule that contained a poorly written regular expression that ended up creating excessive backtracking. The regular expression that was at the heart of the outage is (?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))

Although the regular expression itself is of interest to many people (and is discussed more below), the real story of how the Cloudflare service went down for 27 minutes is much more complex than “a regular expression went bad”. We’ve taken the time to write out the series of events that led to the outage and kept us from responding quickly. And, if you want to know more about regular expression backtracking and what to do about it, then you’ll find it in an appendix at the end of this post.

What happened

Let’s begin with the sequence of events. All times in this blog are UTC.

At 13:42 an engineer working on the firewall team deployed a minor change to the rules for XSS detection via an automatic process. This generated a Change Request ticket. We use Jira to manage these tickets and a screenshot is below.

Three minutes later the first PagerDuty page went out indicating a fault with the WAF. This was a synthetic test that checks the functionality of the WAF (we have hundreds of such tests) from outside Cloudflare to ensure that it is working correctly. This was rapidly followed by pages indicating many other end-to-end tests of Cloudflare services failing, a global traffic drop alert, widespread 502 errors and then many reports from our points-of-presence (PoPs) in cities worldwide indicating there was CPU exhaustion.

Some of these alerts hit my watch and I jumped out of the meeting I was in and was on my way back to my desk when a leader in our Solutions Engineering group told me we had lost 80% of our traffic. I ran over to SRE where the team was debugging the situation. In the initial moments of the outage there was speculation it was an attack of some type we’d never seen before.

Cloudflare’s SRE team is distributed around the world, with continuous, around-the-clock coverage. Alerts like these, the vast majority of which are noting very specific issues of limited scopes in localized areas, are monitored in internal dashboards and addressed many times every day. This pattern of pages and alerts, however, indicated that something gravely serious had happened, and SRE immediately declared a P0 incident and escalated to engineering leadership and systems engineering.

The London engineering team was at that moment in our main event space listening to an internal tech talk. The talk was interrupted and everyone assembled in a large conference room and others dialed-in. This wasn’t a normal problem that SRE could handle alone, it needed every relevant team online at once.

At 14:00 the WAF was identified as the component causing the problem and an attack dismissed as a possibility. The Performance Team pulled live CPU data from a machine that clearly showed the WAF was responsible. Another team member used strace to confirm. Another team saw error logs indicating the WAF was in trouble. At 14:02 the entire team looked at me when it was proposed that we use a ‘global kill’, a mechanism built into Cloudflare to disable a single component worldwide.

But getting to the global WAF kill was another story. Things stood in our way. We use our own products and with our Access service down we couldn’t authenticate to our internal control panel (and once we were back we’d discover that some members of the team had lost access because of a security feature that disables their credentials if they don’t use the internal control panel frequently).

And we couldn’t get to other internal services like Jira or the build system. To get to them we had to use a bypass mechanism that wasn’t frequently used (another thing to drill on after the event). Eventually, a team member executed the global WAF kill at 14:07 and by 14:09 traffic levels and CPU were back to expected levels worldwide. The rest of Cloudflare's protection mechanisms continued to operate.

Then we moved on to restoring the WAF functionality. Because of the sensitivity of the situation we performed both negative tests (asking ourselves “was it really that particular change that caused the problem?”) and positive tests (verifying the rollback worked) in a single city using a subset of traffic after removing our paying customers’ traffic from that location.

At 14:52 we were 100% satisfied that we understood the cause and had a fix in place, and the WAF was re-enabled globally.

How Cloudflare operates

Cloudflare has a team of engineers who work on our WAF Managed Rules product; they are constantly working to improve detection rates, lower false positives, and respond rapidly to new threats as they emerge. In the last 60 days, 476 change requests have been handled for the WAF Managed Rules (averaging one every 3 hours).

This particular change was to be deployed in “simulate” mode where real customer traffic passes through the rule but nothing is blocked. We use that mode to test the effectiveness of a rule and measure its false positive and false negative rate. But even in the simulate mode the rules actually need to execute and in this case the rule contained a regular expression that consumed excessive CPU.

As can be seen from the Change Request above there’s a deployment plan, a rollback plan and a link to the internal Standard Operating Procedure (SOP) for this type of deployment. The SOP for a rule change specifically allows it to be pushed globally. This is very different from all the software we release at Cloudflare where the SOP first pushes software to an internal dogfooding network point of presence (PoP) (which our employees pass through), then to a small number of customers in an isolated location, followed by a push to a large number of customers and finally to the world.

The process for a software release looks like this: We use git internally via BitBucket. Engineers working on changes push code which is built by TeamCity and when the build passes, reviewers are assigned. Once a pull request is approved the code is built and the test suite runs (again).

If the build and tests pass then a Change Request Jira is generated and the change has to be approved by the relevant manager or technical lead. Once approved deployment to what we call the “animal PoPs” occurs: DOG, PIG, and the Canaries.

The DOG PoP is a Cloudflare PoP (just like any of our cities worldwide) but it is used only by Cloudflare employees. This dogfooding PoP enables us to catch problems early before any customer traffic has touched the code. And it frequently does.

If the DOG test passes successfully code goes to PIG (as in “Guinea Pig”). This is a Cloudflare PoP where a small subset of customer traffic from non-paying customers passes through the new code.

If that is successful the code moves to the Canaries. We have three Canary PoPs spread across the world and run paying and non-paying customer traffic running through them on the new code as a final check for errors.

Cloudflare software release process

Once successful in Canary the code is allowed to go live. The entire DOG, PIG, Canary, Global process can take hours or days to complete, depending on the type of code change. The diversity of Cloudflare’s network and customers allows us to test code thoroughly before a release is pushed to all our customers globally. But, by design, the WAF doesn’t use this process because of the need to respond rapidly to threats.

WAF Threats

In the last few years we have seen a dramatic increase in vulnerabilities in common applications. This has happened due to the increased availability of software testing tools, like fuzzing for example (we just posted a new blog on fuzzing here).

Source: https://cvedetails.com/

What is commonly seen is that a Proof of Concept (PoC) is created and often published on GitHub quickly, so that teams running and maintaining applications can test to make sure they have adequate protections. Because of this, it’s imperative that Cloudflare are able to react as quickly as possible to new attacks to give our customers a chance to patch their software.

A great example of how Cloudflare proactively provided this protection was through the deployment of our protections against the SharePoint vulnerability in May (blog here). Within a short space of time from publicised announcements, we saw a huge spike in attempts to exploit our customers’ SharePoint installations. Our team continuously monitors for new threats and writes rules to mitigate them on behalf of our customers.

The specific rule that caused last Tuesday’s outage was targeting Cross-site scripting (XSS) attacks. These too have increased dramatically in recent years.

Source: https://cvedetails.com/

The standard procedure for a WAF Managed Rules change indicates that Continuous Integration (CI) tests must pass prior to a global deploy. That happened normally last Tuesday and the rules were deployed. At 13:31 an engineer on the team had merged a Pull Request containing the change after it was approved.

At 13:37 TeamCity built the rules and ran the tests, giving it the green light. The WAF test suite tests that the core functionality of the WAF works and consists of a large collection of unit tests for individual matching functions. After the unit tests run the individual WAF rules are tested by executing a huge collection of HTTP requests against the WAF. These HTTP requests are designed to test requests that should be blocked by the WAF (to make sure it catches attacks) and those that should be let through (to make sure it isn’t over-blocking and creating false positives). What it didn’t do was test for runaway CPU utilization by the WAF, and examining the log files from previous WAF builds shows that no increase in test suite run time was observed with the rule that would ultimately cause CPU exhaustion on our edge.

With the tests passing, TeamCity automatically began deploying the change at 13:42.

Quicksilver

Because WAF rules are required to address emergent threats they are deployed using our Quicksilver distributed key-value (KV) store that can push changes globally in seconds. This technology is used by all our customers when making configuration changes in our dashboard or via the API and is the backbone of our service’s ability to respond to changes very, very rapidly.

We haven’t really talked about Quicksilver much. We previously used Kyoto Tycoon as a globally distributed key-value store, but we ran into operational issues with it and wrote our own KV store that is replicated across our more than 180 cities. Quicksilver is how we push changes to customer configuration, update WAF rules, and distribute JavaScript code written by customers using Cloudflare Workers.

From clicking a button in the dashboard or making an API call to change configuration to that change coming into effect takes seconds, globally. Customers have come to love this high speed configurability. And with Workers they expect near instant, global software deployment. On average Quicksilver distributes about 350 changes per second.

And Quicksilver is very fast. On average we hit a p99 of 2.29s for a change to be distributed to every machine worldwide. Usually, this speed is a great thing. It means that when you enable a feature or purge your cache you know that it’ll be live globally nearly instantly. When you push code with Cloudflare Workers it's pushed out at the same speed. This is part of Cloudflare's promise of fast updates when you need them.

However, in this case, that speed meant that a change to the rules went global in seconds. You may notice that the WAF code uses Lua. Cloudflare makes use of Lua extensively in production and details of the Lua in the WAF have been discussed before. The Lua WAF uses PCRE internally and it uses backtracking for matching and has no mechanism to protect against a runaway expression. More on that and what we're doing about it below.

Everything that occurred up to the point the rules were deployed was done “correctly”: a pull request was raised, it was approved, CI/CD built the code and tested it, a change request was submitted with an SOP detailing rollout and rollback, and the rollout was executed.

Cloudflare WAF deployment process



What went wrong

As noted, we deploy dozens of new rules to the WAF every week, and we have numerous systems in place to prevent any negative impact of that deployment. So when things do go wrong, it’s generally the unlikely convergence of multiple causes. Getting to a single root cause, while satisfying, may obscure the reality. Here are the multiple vulnerabilities that converged to get to the point where Cloudflare’s service for HTTP/HTTPS went offline.

  1. An engineer wrote a regular expression that could easily backtrack enormously.
  2. A protection that would have helped prevent excessive CPU use by a regular expression was removed by mistake during a refactoring of the WAF weeks prior—a refactoring that was part of making the WAF use less CPU.
  3. The regular expression engine being used didn’t have complexity guarantees.
  4. The test suite didn’t have a way of identifying excessive CPU consumption.
  5. The SOP allowed a non-emergency rule change to go globally into production without a staged rollout.
  6. The rollback plan required running the complete WAF build twice, taking too long.
  7. The first alert for the global traffic drop took too long to fire.
  8. We didn’t update our status page quickly enough.
  9. We had difficulty accessing our own systems because of the outage and the bypass procedure wasn’t well trained on.
  10. SREs had lost access to some systems because their credentials had been timed out for security reasons.
  11. Our customers were unable to access the Cloudflare Dashboard or API because they pass through the Cloudflare edge.

What’s happened since last Tuesday

Firstly, we stopped all release work on the WAF completely and are doing the following:

  1. Re-introducing the excessive CPU usage protection that got removed. (Done)
  2. Manually inspecting all 3,868 rules in the WAF Managed Rules to find and correct any other instances of possible excessive backtracking. (Inspection complete)
  3. Introducing performance profiling for all rules to the test suite. (ETA: July 19)
  4. Switching to either the re2 or Rust regex engine, which both have run-time guarantees. (ETA: July 31)
  5. Changing the SOP to do staged rollouts of rules in the same manner used for other software at Cloudflare while retaining the ability to do emergency global deployment for active attacks.
  6. Putting in place an emergency ability to take the Cloudflare Dashboard and API off Cloudflare's edge.
  7. Automating updates of the Cloudflare Status page.

In the longer term we are moving away from the Lua WAF that I wrote years ago. We are porting the WAF to use the new firewall engine. This will make the WAF both faster and add yet another layer of protection.

Conclusion

This was an upsetting outage for our customers and for the team. We responded quickly to correct the situation and are correcting the process deficiencies that allowed the outage to occur, and we are going deeper to protect against any further possible problems with the way we use regular expressions by replacing the underlying technology used.

We are ashamed of the outage and sorry for the impact on our customers. We believe the changes we’ve made mean such an outage will never recur.

Appendix: About Regular Expression Backtracking

To fully understand how (?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))  caused CPU exhaustion you need to understand a little about how a standard regular expression engine works. The critical part is .*(?:.*=.*). The (?: and matching ) are a non-capturing group (i.e. the expression inside the parentheses is grouped together as a single expression).

For the purposes of the discussion of why this pattern causes CPU exhaustion we can safely ignore it and treat the pattern as .*.*=.*. When reduced to this, the pattern obviously looks unnecessarily complex; but what's important is that any "real-world" expression (like the complex ones in our WAF rules) that asks the engine to "match anything followed by anything" can lead to catastrophic backtracking. Here’s why.

In a regular expression, . means match a single character, .* means match zero or more characters greedily (i.e. match as much as possible) so .*.*=.* means match zero or more characters, then match zero or more characters, then find a literal = sign, then match zero or more characters.

Consider the test string x=x. This will match the expression .*.*=.*. The .*.* before the equal can match the first x (one of the .* matches the x, the other matches zero characters). The .* after the = matches the final x.

It takes 23 steps for this match to happen. The first .* in .*.*=.* acts greedily and matches the entire x=x string. The engine moves on to consider the next .*. There are no more characters left to match so the second .* matches zero characters (that’s allowed). Then the engine moves on to the =. As there are no characters left to match (the first .* having consumed all of x=x) the match fails.

At this point the regular expression engine backtracks. It returns to the first .* and matches it against x= (instead of x=x) and then moves onto the second .*. That .* matches the second x and now there are no more characters left to match. So when the engine tries to match the = in .*.*=.* the match fails. The engine backtracks again.

This time it backtracks so that the first .* is still matching x= but the second .* no longer matches x; it matches zero characters. The engine then moves on to try to find the literal = in the .*.*=.* pattern but it fails (because it was already matched against the first .*). The engine backtracks again.

This time the first .* matches just the first x. But the second .* acts greedily and matches =x. You can see what’s coming. When it tries to match the literal = it fails and backtracks again.

The first .* still matches just the first x. Now the second .* matches just =. But, you guessed it, the engine can’t match the literal = because the second .* matched it. So the engine backtracks again. Remember, this is all to match a three character string.

Finally, with the first .* matching just the first x and the second .* matching zero characters, the engine is able to match the literal = in the expression with the = in the string. It moves on and the final .* matches the final x.

23 steps to match x=x. Here’s a short video of that using the Perl Regexp::Debugger showing the steps and backtracking as they occur.

That’s a lot of work but what happens if the string is changed from x=x to x=xx? This time it takes 33 steps to match. And if the input is x=xxx it takes 45. That’s not linear. Here’s a chart showing matching from x=x to x=xxxxxxxxxxxxxxxxxxxx (20 x’s after the =). With 20 x’s after the = the engine takes 555 steps to match! (Worse, if the x= was missing, so the string was just 20 x’s, the engine would take 4,067 steps to find the pattern doesn’t match).


This video shows all the backtracking necessary to match x=xxxxxxxxxxxxxxxxxxxx:


That’s bad because as the input size goes up the match time goes up super-linearly. But things could have been even worse with a slightly different regular expression. Suppose it had been .*.*=.*; (i.e. there’s a literal semicolon at the end of the pattern). This could easily have been written to try to match an expression like foo=bar;.

This time the backtracking would have been catastrophic. To match x=x takes 90 steps instead of 23. And the number of steps grows very quickly. Matching x= followed by 20 x’s takes 5,353 steps. Here’s the corresponding chart. Look carefully at the Y-axis values compared to the previous chart.
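To get a rough feel for this growth yourself, here is a minimal sketch (assuming python3 is available; Python's re module is also a backtracking engine, so its exact step counts differ, but match time should still grow super-linearly) that times the failing match as the input grows:

python3 - <<'EOF'
import re
import time

# '.*.*=.*;' can never match these inputs (there is no semicolon),
# so the engine must backtrack through its alternatives before failing.
for n in (100, 1000, 10000):
    s = 'x=' + 'x' * n
    start = time.perf_counter()
    re.match(r'.*.*=.*;', s)
    print(f'n={n}: {time.perf_counter() - start:.4f}s')
EOF

The reported times grow far faster than the input length does.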

To complete the picture here are all 5,353 steps of failing to match x=xxxxxxxxxxxxxxxxxxxx against .*.*=.*;


Using lazy rather than greedy matches helps control the amount of backtracking that occurs in this case. If the original expression is changed to .*?.*?=.*? then matching x=x takes 11 steps (instead of 23) and so does matching x=xxxxxxxxxxxxxxxxxxxx. That’s because the ? after the .* instructs the engine to match the smallest number of characters first before moving on.

But laziness isn’t the total solution to this backtracking behaviour. Changing the catastrophic example .*.*=.*; to .*?.*?=.*?; doesn’t change its run time at all. x=x still takes 90 steps and x= followed by 20 x’s still takes 5,353 steps.

The only real solution, short of fully re-writing the pattern to be more specific, is to move away from a regular expression engine with this backtracking mechanism. Which we are doing within the next few weeks.

The solution to this problem has been known since 1968 when Ken Thompson wrote a paper titled “Programming Techniques: Regular expression search algorithm”. The paper describes a mechanism for converting a regular expression into an NFA (non-deterministic finite automata) and then following the state transitions in the NFA using an algorithm that executes in time linear in the size of the string being matched against.

Thompson’s paper doesn’t actually talk about NFA but the linear time algorithm is clearly explained and an ALGOL-60 program that generates assembly language code for the IBM 7094 is presented. The implementation may be arcane but the idea it presents is not.

Here’s what the .*.*=.* regular expression would look like when diagrammed in a similar manner to the pictures in Thompson’s paper.


Figure 0 has five states starting at 0. There are three loops which begin with the states 1, 2 and 3. These three loops correspond to the three .* in the regular expression. The three lozenges with dots in them match a single character. The lozenge with an = sign in it matches the literal = sign. State 4 is the ending state, if reached then the regular expression has matched.

To see how such a state diagram can be used to match the regular expression .*.*=.* we’ll examine matching the string x=x. The program starts in state 0 as shown in Figure 1.


The key to making this algorithm work is that the state machine is in multiple states at the same time. The NFA will take every transition it can, simultaneously.

Even before it reads any input, it immediately transitions to both states 1 and 2 as shown in Figure 2.


Looking at Figure 2 we can see what happens when it considers the first x in x=x. The x can match the top dot by transitioning from state 1 and back to state 1. Or the x can match the dot below it by transitioning from state 2 and back to state 2.

So after matching the first x in x=x the states are still 1 and 2. It’s not possible to reach state 3 or 4 because a literal = sign is needed.

Next the algorithm considers the = in x=x. Much like the x before it, it can be matched by either of the top two loops transitioning from state 1 to state 1 or state 2 to state 2, but additionally the literal = can be matched and the algorithm can transition state 2 to state 3 (and immediately state 4). That’s illustrated in Figure 3.

Next the algorithm reaches the final x in x=x. From states 1 and 2 the same transitions are possible back to states 1 and 2. From state 3 the x can match the dot on the right and transition back to state 3.

At that point every character of x=x has been considered and because state 4 has been reached the regular expression matches that string. Each character was processed once so the algorithm was linear in the length of the input string. And no backtracking was needed.

It might also be obvious that once state 4 was reached (after x= was matched) the regular expression had matched and the algorithm could terminate without considering the final x at all.

This algorithm is linear in the size of its input.
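To make the simultaneous-states idea concrete, here is a minimal sketch (written for this post, not Cloudflare's or Thompson's code) that hard-codes the five-state machine of Figure 0 and simulates it, assuming python3. Every input character is examined exactly once:

python3 - <<'EOF'
# States: 0 = start, 1/2/3 = the three .* loops, 4 = accepting.
def eps_closure(states):
    closure = set(states)
    if 0 in closure:
        closure |= {1, 2}  # state 0 reaches states 1 and 2 without consuming input
    if 3 in closure:
        closure.add(4)     # reaching state 3 also reaches accepting state 4
    return closure

def matches(text):
    states = eps_closure({0})
    for ch in text:
        nxt = set()
        if 1 in states:
            nxt.add(1)      # first .*: any character loops on state 1
        if 2 in states:
            nxt.add(2)      # second .*: any character loops on state 2
            if ch == '=':
                nxt.add(3)  # the literal '=' transitions state 2 to state 3
        if 3 in states:
            nxt.add(3)      # final .*: any character loops on state 3
        states = eps_closure(nxt)
    return 4 in states      # matched if the accepting state was reached

print(matches('x=x'))  # True  -- one linear pass, no backtracking
print(matches('xxx'))  # False -- no '=' means states 3 and 4 are unreachable
EOF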


07:45

Saturday Morning Breakfast Cereal - Meaning [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
Are all non-human creatures technically nihilists?


Today's News:

02:00

What is Silverblue? [Fedora Magazine]

Fedora Silverblue is becoming more and more popular inside and outside the Fedora world. So, based on feedback from the community, here are answers to some interesting questions about the project. If you have any other Silverblue-related questions, please leave them in the comments section and we will try to answer them in a future article.

What is Silverblue?

Silverblue is a codename for the new generation of the desktop operating system, previously known as Atomic Workstation. The operating system is delivered in images that are created by utilizing the rpm-ostree project. The main benefits of the system are speed, security, atomic updates and immutability.

What does “Silverblue” actually mean?

“Team Silverblue”, or “Silverblue” for short, doesn’t have any hidden meaning. It was chosen after roughly two months, when the project, previously known as Atomic Workstation, was rebranded. Over 150 words or word combinations were reviewed in the process. In the end Silverblue was chosen because it had an available domain as well as available social media accounts. One could think of it as a new take on Fedora’s blue branding, and it can be used in phrases like “Go, Team Silverblue!” or “Want to join the team and improve Silverblue?”.

What is ostree?

OSTree or libostree is a project that combines a “git-like” model for committing and downloading bootable filesystem trees, together with a layer to deploy them and manage the bootloader configuration. OSTree is used by rpm-ostree, a hybrid package/image based system that Silverblue uses. It atomically replicates a base OS and allows the user to “layer” the traditional RPM on top of the base OS if needed.

Why use Silverblue?

Because it allows you to concentrate on your work and not on the operating system you’re running. It’s more robust as the updates of the system are atomic. The only thing you need to do is to restart into the new image. Also, if there’s anything wrong with the currently booted image, you can easily reboot/rollback to the previous working one, if available. If it isn’t, you can download and boot any other image that was generated in the past, using the ostree command.
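A minimal example of that rollback, assuming the previous deployment is still on disk:

rpm-ostree rollback   # make the previous deployment the default again
systemctl reboot      # boot back into it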

Another advantage is the possibility of an easy switch between branches (or, in an old context, Fedora releases). You can easily try the Rawhide or updates-testing branch and then return to the one that contains the current stable release. Also, you should consider Silverblue if you want to try something new and unusual.
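Switching is a single rebase command. The ref below is illustrative (the Rawhide Silverblue ref, as far as I know); you can list what your remote actually offers with ostree remote refs fedora:

rpm-ostree rebase fedora:fedora/rawhide/x86_64/silverblue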

What are the benefits of an immutable OS?

Having the root filesystem mounted read-only by default increases resilience against accidental damage as well as some types of malicious attack. The primary tool to upgrade or change the root filesystem is rpm-ostree.

Another benefit is robustness. It’s nearly impossible for a regular user to get the OS into a state where it doesn’t boot or doesn’t work properly after accidentally or unintentionally removing some system library. Try to think about these kinds of experiences from your past, and imagine how Silverblue could help you there.

How does one manage applications and packages in Silverblue?

For graphical user interface applications, Flatpak is recommended, if the application is available as a flatpak. Users can choose between Flatpaks from Fedora (built from Fedora packages, in Fedora-owned infrastructure) or from Flathub, which currently has a wider offering. Users can install them easily through GNOME Software, which already supports Fedora Silverblue.
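For example, from the command line instead of GNOME Software (the remote URL is Flathub's standard one; the application ID is just an example):

flatpak remote-add --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
flatpak install flathub org.gnome.Calculator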

One of the first things users find out is that there is no dnf preinstalled in the OS. The main reason is that it wouldn’t work on Silverblue; part of its functionality has been replaced by the rpm-ostree command. Users can overlay traditional packages by using rpm-ostree install PACKAGE, but this should only be done when there is no other way. That's because every time a new system image is pulled from the repository, the image must be rebuilt to accommodate any layered packages, as well as any packages that were removed from the base OS or replaced with a different version.
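For instance, layering a package and booting into the rebuilt image looks like this (the package choice is just an example):

rpm-ostree install htop   # layer the RPM on top of the base image
systemctl reboot          # boot into the newly composed deployment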

Fedora Silverblue comes with the default set of GUI applications that are part of the base OS. The team is working on porting them to Flatpaks so they can be distributed that way. As a benefit, the base OS will become smaller and easier to maintain and test, and users can modify their default installation more easily. If you want to look at how it’s done or help, take a look at the official documentation.

What is Toolbox?

Toolbox is a project to make containers easily consumable for regular users. It does that by using podman’s rootless containers. Toolbox lets you easily and quickly create a container with a regular Fedora installation that you can play with or develop on, separated from your OS.
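In practice that is just two commands (a quick sketch):

toolbox create   # create a container based on a regular Fedora image
toolbox enter    # open a shell inside it, separate from the host OS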

Is there any Silverblue roadmap?

Formally there isn’t any, as we’re focusing on problems we discover during our testing and from community feedback. We’re currently using Fedora’s Taiga to do our planning.

What’s the release life cycle of Silverblue?

It’s the same as regular Fedora Workstation. A new release comes every 6 months and is supported for 13 months. The team plans to release updates for the OS bi-weekly (or longer) instead of daily as they currently do. That way the updates can be more thoroughly tested by QA and community volunteers before they are sent to the rest of the users.

What is the future of the immutable OS?

From our point of view the future of the desktop involves the immutable OS. It’s safest for the user, and Android, ChromeOS, and the latest macOS release (Catalina) all use this method under the hood. For the Linux desktop there are still problems with some third party software that expects to write to the OS. HP printer drivers are a good example.

Another issue is how parts of the system are distributed and installed. Fonts are a good example. Currently in Fedora they’re distributed in RPM packages. If you want to use them, you have to overlay them and then restart to the newly created image that contains them.

What is the future of standard Workstation?

There is a possibility that Silverblue will replace the regular Workstation. But there’s still a long way to go for Silverblue to provide the same functionality and user experience as the Workstation. In the meantime, both desktop offerings will be delivered in parallel.

How does Atomic Workstation or Fedora CoreOS relate to any of this?

Atomic Workstation was the name of the project before it was renamed to Fedora Silverblue.

Fedora CoreOS is a different, but similar project. It shares some fundamental technologies with Silverblue, such as rpm-ostree, toolbox and others. Nevertheless, CoreOS is a more minimal, container-focused and automatically updating OS.

Thursday, 11 July

19:33

Firefox 68 available now in Fedora [Fedora Magazine]

Earlier this week, Mozilla released version 68 of the Firefox web browser. Firefox is the default web browser in Fedora, and this update is now available in the official Fedora repositories.

This Firefox release provides a range of bug fixes and enhancements, including:

  • Better handling when using dark GTK themes (like Adwaita Dark). Previously, running a dark theme may have caused issues where user interface elements on a rendered webpage (like forms) were rendered in the dark theme, on a white background. Firefox 68 resolves these issues. Refer to these two Mozilla bugzilla tickets for more information.
  • The about:addons special page has two new features to keep you safer when installing extensions and themes in Firefox. First is the ability to report security and stability issues with addons directly in the about:addons page. Additionally, about:addons now has a list of secure and stable extensions and themes that have been vetted by the Recommended Extensions program.

Updating Firefox in Fedora

Firefox 68 has already been pushed to the stable Fedora repositories. The security fix will be applied to your system with your next update. You can also update the firefox package only by running the following command:

$ sudo dnf update --refresh firefox

This command requires you to have sudo set up on your system. Additionally, note that not every Fedora mirror syncs at the same rate. Community sites graciously donate space and bandwidth to these mirrors to carry Fedora content. You may need to try again later if your selected mirror is still awaiting the latest update.

07:59

Saturday Morning Breakfast Cereal - Peak [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
I've done it. I found something even lazier than graph jokes.


Today's News:

07:03

The Network is the Computer: A Conversation with John Gage [The Cloudflare Blog]

The Network is the Computer: A Conversation with John Gage

To learn more about the origins of The Network is the Computer®, I spoke with John Gage, the creator of the phrase and the 21st employee of Sun Microsystems. John had a key role in shaping the vision of Sun and had a lot to share about his vision for the future. Listen to our conversation and read the full transcript below (or click here to open in a new window).


[00:00:13]

John Graham-Cumming: I’m talking to John Gage who was what, the 21st employee of Sun Microsystems, which is what Wikipedia claims and it also claims that you created this phrase “The Network is the Computer,” and that's actually one of the things I want to talk about with you a little bit because I remember when I was in Silicon Valley seeing that slogan plastered about the place and not quite understanding what it meant. So do you want to tell me what you meant by it or what Sun meant by it at the time?

[00:00:40]

John Gage: Well, in 2019, recalling what it meant in 1982 or '83 will be colored by all our experience since then, but at the time it seemed so obvious that when we introduced the first scientific workstations, they were not very powerful computers. The first Suns had a giant screen and they were on the Internet, but they were designed as a complementary component to supercomputers. Bill Joy and I had a series of diagrams for talks we’d give, and Bill had the bi-modal, the two node picture. The serious computing occurred on the giant machines where you could fly into the heart of a black hole, and the human interface was the workstation across the network. So each had to complement the other, each built on the strengths of the other, and each enhanced the other, because to deal in those days with a supercomputer was very ugly. And you could run all your very large computations on a Sun, because we had virtual memory and a series of such advanced things, but not fast. So the speed of scientific understanding is deeply affected by the tools the scientist has — is it a microscope, is it an optical telescope, is it a view into the heart of a star by running a simulation on a supercomputer? You need to have the loop with the human and the science constantly interacting and constantly modifying each other, and that’s what the network is for, to tie those different nodes together in as seamless a way as possible. Then, the instant anyone that’s ever created a programming language says, “so if I have to create a syntax for this where I’m trying to let you express ‘do this,’ how about the delay on the network, the latency?” Does your phrase “The Network is the Computer” really capture the hundreds, thousands, tens of thousands, millions perhaps at that time, now billions and billions today, all these devices interacting and exchanging state with latency, with delay? It’s sort of an oversimplification, as we would point out, but it’s just: the network is the computer. Four words. What we tried to do is give a metaphor that allows you to explore it in your mind and think of new things to do and be inspired.

[00:03:35]

Graham-Cumming: And then by a sort of strange sequence of events, that was a trademark of Sun. It got abandoned. And now Cloudflare has swooped in and trademarked it again. So now it's our trademark which sort of brings us full circle, I suppose.

[00:03:51]

Gage: Well, trademarks are dealing with the real world, but the inspiration of Cloudflare is to do exactly what Bill Joy and I were talking about in 1982. It's to build an environment in which every participant globally can share with security, and we were not as strong then. Bill wrote most of the code of TCP/IP implemented by every other computer vendor, and still there are these questions of latency, these questions of distributed denial of service: how do you block that? I was so happy to see that Cloudflare invests real money and real people in addressing those kinds of critical problems, which are at the core of what could destroy the Internet.

[00:04:48]

Graham-Cumming: Yes, I agree. I mean, it is a significant investment to actually deal with it, and what I think people don't appreciate about the DDoS attack situation is that the attacks are going on all the time; it's just continuous, you know, it just depends who the target is. It's funny you mentioned TCP/IP, because about 10 years after, so in about ‘92, in my first real job, I had to write a TCP/IP stack for an obscure network card. And this was prior to the Internet really being available everywhere. And so I didn't realize I could go and get the BSD implementation and recompile it. So I did it from scratch from the RFCs.

[00:05:23]

Gage: You did!

[00:05:25]

Graham-Cumming: And the thing I recommend here is that nobody ever does that, because, you know, in the real world, real code that really interacts is really hard when you're trying to make it work with other things, so.

[00:05:36]

Gage: Do you still, John, do you have that code?

[00:05:42]

Graham-Cumming: I wonder. I have the binary for it.

[00:05:46]

Gage: Do hunt for it, because our story was, at the time, that DARPA, the Defense Advanced Research Projects Agency, had funded networking initiatives around the world. I just had a discussion yesterday with Norway, and they were one of the first entities to implement, using essentially Bill Joy’s code, to be placed on the ARPANET. And a challenge went out, and at that time the slightly older generation, the Bolt Beranek and Newman group, Vint Cerf, Bob Kahn, those names. Vint Cerf was a grad student at UCLA, where he had built one of the first four Internet sites, and the DARPA offices were in Arlington, Virginia. They had massive investments in detection of nuclear underground tests, so seismological data, and the moment we made the very first Suns, I shipped them to DARPA, we got the network up and began serving seismic data globally. Really lovely visualization of events. If you’re trying to detect something, those things go off and then there’s a distinctive signature, a collapse of the underground cavern after. So DARPA had tried to implement, as you did, from the spec, from the RFC, the components, and Vint had designed a lot of this, all the acknowledgement codes and so forth that you had to implement in TCP/IP. So Bill was a graduate student at Berkeley, and we had a meeting in Arlington at DARPA headquarters where BBN and AT&T Bell Labs and a number of other people were in the room. Their code didn’t work; this graduate student from Berkeley named Bill Joy, his code did work. And when Bob Kahn and Vint Cerf asked Bill, “Well, so how did you do it?”, what he said was exactly what you just said: he said, “I just read the spec and wrote the code.”

[00:08:12]

Graham-Cumming: I do remember it very distinctly, because the company I was working at didn’t have a TCP/IP stack and we didn’t have any IP machines, right; we were actually doing stuff that was all IBM networking, SNA stuff. Somehow we bought what was by that point an HP machine, an Apollo workstation, and a Sun workstation. I had them on Ethernet and talking to each other. And I do distinctly remember the first time a ping packet came back from that Sun box, saying, yes, I managed to send you an IP packet, you managed to send me an ICMP response, and that was pretty magical. And then I got to TCP and that was hard.

[00:08:55]

Gage: That was hard. Yeah. When you get down to the details, the spec can be wrong. I mean, it will want you to do something that’s a stupid thing to do. So Bill has such good taste in these things. It would be interesting to do a kind of diff across the various implementations of the stack. Years and years later we had maybe 50 companies all assembled in a room — only engineers; throw out all the marketing people and all the Ps and VPs — and every company in this room: IBM, Hewlett-Packard — oh my God, Hewlett-Packard, fix your TCP — and we just kept going until everybody could work with everybody else, in sort of a pact. We’re not going to reveal, Honeywell, that you guys were great with earlier absolute assembly code, determinate time control stuff, but you have no clue about how packets work; we’ll help you, so that all of us can make every machine interoperate. Which yielded the network show, Interop. Every year we would go put a bunch of fiber inside whatever venue — you know, Geneva, or Las Vegas, some big venue.

[00:10:30]

Graham-Cumming: I used to go to Vegas all the time and that was my great introduction to Vegas was going there for Interop, year after year.

[00:10:35]

Gage: Oh, you did! Oh, great.

[00:10:36]

Graham-Cumming: Yes, yes, yes.

[00:10:39]

Gage: You know, in a way, what you’re doing — for example, just last week with the Verizon problem — everybody implementing what you’re doing now who is not open about their mistakes and what they’ve learned, and is not sharing this, it’s a problem. And your global presence to me is another absolutely critical thing. We had about, I forget, 600 engineers in Beijing at the East Gate of Tsinghua, a lot of networking expertise, and lots of those people are at Tencent and Huawei and those network providers throughout the rest of the world. Politics comes and goes, but the engineering has to be done in a way that protects us. And so these conversations globally are critical.

[00:11:33]

Graham-Cumming: Yes, that's one of the fascinating things about doing real things on the real Internet: there is a global community of people making computers talk to each other, and it's a tremendously complicated thing to actually make that work, and you do it across countries, across languages. But you end up actually making it work, and that's the Internet we're sitting on — that you and I are talking on right now — and it's based on those conversations around the world.

[00:12:01]

Gage: And only by doing it do you understand more deeply how to do it. It’s very difficult in the abstract to say what should happen as we begin to spread. As Sun grew, every major city in Africa had installations, and for network access you were totally dependent on an often very corrupt national telco, or the complications of dealing with these people just to make your packets move. And as it turned out, many of the intelligence and military entities in all of these countries had very little understanding of any of this. That’s changed to some degree. But the dangerous sides of the Internet: total surveillance, IPv6, complete control of the exact identity of the origins of packets. We implemented — let’s see, you had an early Sun — we probably completed our IPv6 implementation while it was still fluid in the 90s, but I remember 10 years after we finished a complete implementation of IPv6, the U.S. was still IPv4. It’s still IPv4.

[00:13:25]

Graham-Cumming: It still is, it still is. Pretty much. Except for the mobile carriers right now. I think in general the mobile phone operators are the ones who've gone more into IPv6 than anybody else.

[00:13:37]

Gage: It was remarkable in China. We used to have a conference; we’d bring a thousand Chinese universities into a room. Professor Wu from Tsinghua built the Chinese Education and Research Network, CERNET, and now a thousand universities have a building on campus doing Internet research. We would get up and show this map of China, and he kept his head down politically, but he managed, at the point when there was a big fight between the Minister of Telecom and the Minister of Railways. The Minister of Railways said, look, I have continuity throughout China because I have rail lines. I’ve just made a partnership with the People’s Liberation Army, and they are essentially slave labor, and they’re going to dig the ditches, and I’m going to run fiber alongside the railways, and I don’t care what you, the Minister of Telecommunications, have to say about it, because I own the territory. And that created a separate pathway for the backbone IPv6 network in China. Cheap, cheap, cheap, get everybody doing things.

[00:14:45]

Graham-Cumming: Yes, and of course in China that’s resulted in an interesting situation where you have China Telecom and China Unicom, who sort of cooperate with each other but are almost rivals, which makes IP packets quite difficult to route inside China.

[00:14:58]

Gage: Yes, exactly. At one point I think we had four chunks of China; everything was geographically divided. You know, there were meetings going on; I remember the moment they merged the telecom ministry with the electronics ministry, and since we were working with both of them, I walk into a room and there’s a third group, people I didn’t know. It turns out that’s the People’s Liberation Army.

[00:15:32]

Graham-Cumming: Yes, they’re part of the team. So okay, going back to this “Network is the Computer” notion. You were talking about the initial things that you were doing around that. Why is it okay that Cloudflare has gone out and trademarked that phrase now? Because you seem to think that we've got a leg to stand on, I guess.

[00:15:56]

Gage: Frankly, I’d only vaguely heard of Cloudflare. I’ve been working in other areas; I’ve got a project in the middle of Nairobi, in the slum, where I’ve spent the last 15 years or so learning a lot about clean water and sewage treatment, because we have almost 400,000 people in a very small area, the biggest slum in East Africa. How can you introduce sanitary water and clean sewage treatment into an often corrupt, very difficult environment? That’s been a fascination of mine and I’ve been spending a lot of time on it. What does a computer person know about fluid dynamics and pathogens? There’s a lot to learn. So as you guys grew so rapidly, I vaguely knew of you, but then I started reading your blog about post-quantum crypto and how we devise a network that is resilient to these denial of service attacks, and all these areas where you’re a growing company. It’s very hard to take time to do serious, advanced, research-level work on distributed computing and distributed security, and yet you guys are doing it. When Bill created Java, the subsequent step from Java, for billions and billions of devices to share resources and share computations, was something we called Jini, which is a framework for validation of who you are and movement of code from device to device in a secure way, with total memory control so that someone is not capable of taking over memory in your device, as we’ve seen with Spectre and the failures of these billions of Intel chips out there that all have a flaw in their take-all-branches parallel compute implementations. So the very hardware you’re using can be insecure, your operating systems are insecure, and yet you’re trying to build infallible systems on top of fallible pieces. And you’re in the middle of this, John, which I’m so impressed by.

[00:18:13]

Graham-Cumming: And Jini sort of lives on as Apache River now. It moved away from Sun and into an Apache project.

[00:18:21]

Gage: Yes, very few people seem to realize that the name Apache is a poetic phrasing of “a patchy system.” We patch everything because everything is broken. We moved a lot of it, Brian Behlendorf and the Apache group. Well, many of the innovations at Sun — Java is one, file systems that are far more secure and far more resilient than older file systems, the SPARC implementation — I think Fujitsu still keeps the SPARC architecture as the world’s fastest microprocessor, even though you’re using the new ARM processors.

[00:19:16]

Graham-Cumming: Right. Yes. Being British of course, ARM is a great British success. So I'm honor-bound to use that particular architecture. Clearly.

[00:19:25]

Gage: Oh, absolutely. And the power. That was the one that was always on the list of our engineering goals. We were building supercomputers, we were building very large file servers for the telcos and the banks and the intelligence agencies and all these different people, but we always wanted to make something low power, and it just fell off the list of what you could accomplish. And the ARM chips, their ratios of wattage to packets treated — you have a great metric on your website someplace about measuring these things at a very low level — that’s key.

[00:20:13]

Graham-Cumming: Yes, and we had Sophie Wilson, who of course is one of the founders of ARM and actually worked on the original chip, tell this wonderful story at our Internet Summit about how the first chip they hooked up was operating fine until they realized they hadn't hooked the power up. It was so low power that it was able to use the power coming in over the logic lines to power the whole chip. And they said, wait a minute, we haven't plugged the power in but the thing is running. That was an amazing achievement.

[00:20:50]

Gage: That’s amazing. We open sourced SPARC, the instruction set, so that anybody doing crypto who also had fab capabilities could implement detection of ones and zeroes, sheep and goats, or other kinds of algorithms that are necessary for very high speed crypto. And that’s another aspect of Cloudflare I’m so impressed by. Cloudflare is paying attention at a machine instruction level, because you’re implementing with your own hardware packages in, what, 180 cities? You’re moving a package logistically into Ulan Bator, or into Mombasa, and you’re coming up live.

[00:21:38]

Graham-Cumming: And we need that to be inexpensive and fast, because we're promising people that we will make their Internet properties faster and secure at the same time, and that's one of the interesting challenges: not trading those two things off. Which means your crypto had better be fast, for example, and that requires a lot of fiddling around at the hardware level and understanding it. In our case, because we're using Intel, understanding what Intel chips are really doing at the low level.

[00:22:10]

Gage: Intel did implement a couple of things in one or another of the more recent chips that were very useful for crypto. We had a group of the SPARC engineers, probably 30, at a dinner five or six months ago discussing, yes, we set the world standard for parallel execution branching optimizations for pipelines and chips, and when the overall design is not matched by an implementation that pays attention to protecting the memory, it’s a fundamental, exploitable flaw. So a lot of discussion about this. Selecting precisely which instructions are the most important, the risk analysis with the ability to make a chip specifically to implement a particular algorithm, there’s a lot more to go. We have multiples of performance ahead of us for specific algorithms based on a more fluid way to add instructions that are necessary into a specific piece of hardware. And then we jump to quantum. Oh my.

[00:23:32]

Graham-Cumming: Yes. To talk about that a little bit: the ever-increasing speed of processors and the things we can do — do you think we actually need that, given that we're now living in this incredibly distributed world where we're running very distributed algorithms? Do we really need beefier machines?

[00:23:49]

Gage: At this moment, in a way, it’s you making fun of Bill Joy for only wanting a megabit in Aspen. When Steve Jobs started NeXT, sadly his hardware was just terrible, so we sent a group over to boost NeXT. In fact we sort of secretly slipped him $30 million to keep him afloat. And I’d say, “Jobs, if you really understood something about hardware, it would really be useful here.” So one of the main team members that we sent over to NeXT came to live in Aspen and ended up networking the entire valley. At that point, a megabit for what you needed to do seemed reasonable. So at this moment, as things become alive by the introduction of a little bit of intelligence in them, some little flickering chip that’s able to execute an algorithm, many tasks don’t require more. If you really want to factor things fast: quantum. Which will destroy our existing crypto systems. But if you are just bringing the billions of places where a little bit of knowledge can locally alter a little bit of performance, we could do very well with the compute power that we have right now. But making it live on the network, securely, that’s the key part. The attacks that are going on, simple errors as you had yesterday, are simple errors. In a way, across Cloudflare’s network, you’re watching the challenges of the 21st century take place: attacks, obscure, unknown exploits of devices in the power and water control systems. And so, you are in exactly the right spot to not get much sleep and feel a heavy responsibility.

[00:26:20]

Graham-Cumming: Well, it certainly felt like it yesterday when we were offline for 27 minutes. That’s when we suddenly discover — we sort of know how many customers we have, and then we really discover when they start phoning us. Our support line had its own DDoS, basically, where it didn’t work anymore because so many people phoned in. But yes, I think your point is interesting: a little bit extra on a device somewhere can do something quite magical, and then you link it up to the network and you can do a lot. What we think is going on, partly, is some things around AI, where large amounts of machine learning happen on big beefy machines, perhaps in the cloud, perhaps groups of machines, and then devices do their own little bits of inference, recognizing faces and things like that. And that seems to be an interesting future where we have these devices that are actually intelligent in our pockets.

[00:27:17]

Gage: Oh, I think that’s exactly right. There’s so much power in your pocket. I’m spending a lot of time trying to catch up on that little bit of mathematics that you thought you understood so many years ago, and it turns out, oh my, I need a little bit of work here. I’ve been reading Michael Jordan’s papers and watching his talks — he’s the most cited computer scientist in machine learning — and he will always say, “Be very careful about the use of the phrase ‘Artificial Intelligence’.” Maybe it’s a metaphor, like “The Network is the Computer.” But what we’re doing is gradient descent optimization. Is the slope going up, or is the slope going down? That’s not smart. It’s useful, and real-time language translation and a lot of incredible work can occur when you’re doing phrases. There’s a lot of great pattern work you can do, but he’s out in space, essentially combining differentiation and integration in a form of integral. And off we go. Are your Hessians rippling in the wind? What’s the shape of this slope? And is constantly going downhill actually the fastest path from here to there? Maybe sometimes going uphill and over and then downhill is faster. So there’s just a huge amount of new mathematics coming in this territory. And each time, as we move from 2G to 3G to 4G to 5G, many people don’t appreciate that the compression algorithms changed between the generations, and as a result so much more can move into your mobile device for the same amount of power — 10 or 20 times more. And mathematics leads to insights and applications. And you have a working group in that area, I think. I tried to probe around to see if you’re hiring.
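
To make "is the slope going up, or is the slope going down" concrete, here is a minimal gradient descent sketch in Go; the toy function, starting point, and learning rate are illustrative assumptions, not anything from the conversation.

package main

import "fmt"

func main() {
	// Minimize the toy function f(x) = (x-3)^2, whose gradient is
	// f'(x) = 2*(x-3): negative left of the minimum, positive to the right.
	grad := func(x float64) float64 { return 2 * (x - 3) }

	x := 0.0       // arbitrary starting point
	const lr = 0.1 // learning rate (step size)

	// "Is the slope going up, or is the slope going down?" —
	// take a small step downhill, then repeat.
	for i := 0; i < 50; i++ {
		x -= lr * grad(x)
	}
	fmt.Printf("x after descent: %.4f (true minimum at x = 3)\n", x)
}

Modern machine learning is essentially this loop in millions of dimensions, which is where the new mathematics Gage mentions comes in.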

[00:30:00]

Graham-Cumming: Well, you could always just come and ask us, because we'll probably tell you; we tend to be fairly transparent. But yes, compression is definitely an area where we are interested in doing things. One of the things I first worked on at Cloudflare was a thing that did differential compression, based on the insight that web pages don’t actually change that much when you hit ‘refresh’. It turns out that if you compress based on the delta from the last thing you served to someone, you can send many orders of magnitude less data, and there are lots of interesting things you can do with that kind of insight to save a tremendous amount of bandwidth. So yes, compression is interesting to us, and crypto is interesting to us. We’ve actually open sourced some of our compression improvements in zlib, which is a very popular compression library, and now they've been picked up elsewhere. It turns out that in neuroscience there's a tremendous amount of data that needs compression, and there are pipelines where having better compression algorithms makes you work a lot faster. So it's fascinating to see the overspill of things we’re doing into other areas, where I know nothing about what goes on inside the brain.
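
As a rough sketch of that differential idea (not Cloudflare's actual implementation): Go's standard compress/flate package accepts a preset dictionary, so you can feed in the previous version of a page and let bytes shared with it compress to almost nothing. The page contents below are made up for the example.

package main

import (
	"bytes"
	"compress/flate"
	"fmt"
)

// compressWithDict compresses data with an optional preset dictionary.
// Passing the previously served version of a page as the dictionary means
// any bytes the new version shares with it compress extremely well.
func compressWithDict(data, dict []byte) ([]byte, error) {
	var buf bytes.Buffer
	w, err := flate.NewWriterDict(&buf, flate.BestCompression, dict)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Hypothetical old and new versions of the "same" page after a refresh.
	oldPage := []byte("<html><body><h1>News</h1><p>Yesterday's story...</p></body></html>")
	newPage := []byte("<html><body><h1>News</h1><p>Today's story...</p></body></html>")

	plain, _ := compressWithDict(newPage, nil)     // no dictionary
	delta, _ := compressWithDict(newPage, oldPage) // previous page as dictionary

	fmt.Printf("without dictionary: %d bytes; with previous page as dictionary: %d bytes\n",
		len(plain), len(delta))
}

The receiver would decompress with flate.NewReaderDict and the same previous version; real delta encodings are more sophisticated, but the saving shows up even in this toy.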

[00:31:15]

Gage: Well, isn’t that fascinating, John. I mean, here you are, the CTO of Cloudflare, working on a problem that deeply affects the Internet, enabling a lot more to move across the Internet in less time with less power, and suddenly it turns into a tool for brain modeling and neuroscientists. This is the benefit. There’s a terrific initiative — I’m at Berkeley — the Jupyter notebooks created by Fernando Perez, this environment in which you can write text and code and share things. That environment has been taken up by machine learning. I think it’s a major change. And the implementation of diagrams that are causal, these forms of analysis of what caused what — these are useful across every discipline. And for you to model traffic and see patterns emerge, and find that a webpage’s delta has changed, and then intelligently change the pattern of traffic in response to it — it’s all pretty much the same thing here.

[00:32:53]

Graham-Cumming: Yes, and as a mathematician, when I see things that are the same thing, I can't help wondering what the real deep structure is underneath. There must be another layer down, or something. There's some other deeper layer below all this stuff.

[00:33:12]

Gage: I think this is just endlessly fascinating. So my only recommendations to Cloudflare: first, double what you’re doing. That’s so hard because as you go from 10 people to 100 people to 1,000 people to 10,000 people, it’s a different world. You are a prime example, you are global. Suddenly you’re able to deal with local authorities in 60-70 countries and deal with some of the world’s most interesting terrain and with network connectivity and moving data, surveillance, and some security of the foundation infrastructure of all countries. You couldn’t be engaged in more exciting things.

[00:34:10]

Graham-Cumming: It's true. I mean, one of the most interesting things to me is that I have grown up with the Internet. I got an email address using the crazy JANET scheme in the UK, where the domain names were backwards. I was in Oxford and they gave me an email address; I think it was JGC at uk dot ac dot ox dot prg, and then at some point it flipped around, once DNS looked like it had won. For a long time my address was the wrong way around. I think that's a typically British decision, to be slightly different from everybody else.

[00:35:08]

Gage: Well, Oxford’s always had that style: we’re going to do things differently. There’s an Oxford centre for the 21st century that was created with money from a wonderful guy who had donated maybe $100 million, and they just branched out into every possible research area. But when you went to meetings, you would enter a building that was built at the time of the Raj. It was the India temple of colonialism.

[00:35:57]

Graham-Cumming: There are quite a few of those in the UK. Are you thinking of the Martin School? James Martin. He gave a lot of money to Oxford. Well, the funny thing about the Programming Research Group was that the one thing they didn't really teach us as undergraduates was how to program, because that was seen as getting your hands dirty; you were meant to learn the theory. So we learnt all the theory and did a little bit of functional programming, and that was the extent of it, which set me up really badly for a career in industry. In my first job I had to pretend I knew how to program in C, and learn very quickly.

[00:36:42]

Gage: Oh my. Well now you’ve been writing code in Go.

[00:36:47]

Graham-Cumming: Yes. Well, the other Oxford thing of course is Tony Hoare, who was a professor of computer science there. He had come up with this thing called CSP (Communicating Sequential Processes), a whole theory around how you do parallel execution. And so of course everybody used his formalism, and I did in my doctoral thesis. So when Go came along and they said, oh, this is how Go works, I said, well, clearly that’s CSP and I know how to do this. So I can do it again.
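
For readers who haven't seen the correspondence, here is the classic concurrent prime sieve in Go: communicating sequential processes rendered directly as goroutines and channels. This is a standard textbook example, not code from the thesis.

package main

import "fmt"

// generate emits 2, 3, 4, ... on a channel: one sequential process
// that interacts with the world only by message passing — the CSP shape.
func generate(ch chan<- int) {
	for i := 2; ; i++ {
		ch <- i
	}
}

// filter copies values from in to out, dropping multiples of prime.
func filter(in <-chan int, out chan<- int, prime int) {
	for v := range in {
		if v%prime != 0 {
			out <- v
		}
	}
}

func main() {
	// Each prime found spawns a new filtering process; the pipeline of
	// processes communicating over channels is the whole program.
	ch := make(chan int)
	go generate(ch)
	for i := 0; i < 10; i++ {
		prime := <-ch
		fmt.Println(prime)
		next := make(chan int)
		go filter(ch, next, prime)
		ch = next
	}
}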

[00:37:23]

Gage: Tony Hoare occasionally would issue a statement about something, and it was always a moment. So few people seem to realize that the birth of so much of what we took up in the 60s, 70s, 80s in Silicon Valley and Berkeley derived from the Manchester group: the virtual memory work, these innovations. Today, Whit Diffie — he used to love these Bletchley stories; they were so far advanced. That generation has died off.

[00:38:37]

Graham-Cumming: There’s a very peculiar thing in computer science and the real application of computing, which is that we somehow sit on this great knowledge of the past of computing and at the same time we seem to willfully forget it and reinvent everything every few years. We go through these cycles where it's like, let’s do centralized computing; now distributed computing. No, let’s have desktop PCs; now let’s have the cloud. We seem to have this collective amnesia, and then on occasion people go, “Oh, Leslie Lamport wrote this thing in 1976 about this problem.” In what other subject do we willfully forget the past and then have to go and do archaeology to discover it again?

[00:39:17]

Gage: As a sociological phenomenon, it means that the older crowd in a company are depressing, because they’ll say, “Oh, we tried that and it didn’t work.” Over the years Sun grew from 15 people or so to something like 45,000 people before we were sold off to Oracle, and then everybody dumped out, because Oracle didn’t know too much about computing. Ivan Sutherland, Whit Diffie — Ivan actually stayed on; he may still have an Oracle email. Almost all of the research groups, certainly the chip group, went off to Intel, Fujitsu, Microsoft. It’s funny to think now that Microsoft’s run by a Sun person.

[00:40:19]

Graham-Cumming: Well, it's the same thing. Everyone’s forgotten that Microsoft was the evil empire not that long ago. And now it’s not; right now it’s cool again.

[00:40:28]

Gage: Well, all of the embedded stuff from Microsoft is still the legacy of that Bill Gates who’s now doing wonderful things with the Gates Foundation. But the embedded insecurity of the global networks is due, in large part, to the insecurities, that horrible engineering, of Microsoft embedded everywhere. You go anywhere in China to some old industrial facility, and there is some old, not updated, junky PC running totally insecure software. And it’s controlling the grid. It’s discouraging. It’s like a lot of the SCADA systems.

[00:41:14]

Graham-Cumming: I’m completely terrified of SCADA systems.

[00:41:20]

Gage: The simplest exploits. I mean, it’s nothing even complicated. There is a series of emerging journalists today who are paying attention to cybersecurity, and people have come out with books even very recently. Well, now, because we’re in this China, US, Iran nightmare, there's a United States presidential directive taking the cybersecurity crowd and saying, oops, now you’re an offensive force. Which means we’ve got some 20-year-old lieutenant somewhere who might just, for fun, turn off Tehran’s water supply or something. This is scary, because the SCADA systems are embedded everywhere, and they’re — I don’t know, would you say totally insecure? Just the simple things, just simple exploits. One of the journalists described — I guess it was the Russians — taking a bunch of small USB sticks and, at a shopping center near a military base, just giving them away. And people put them into their PCs inside SIPRNet, inside the secure U.S. Department of Defense network. Instantly the network was taken over, just by inserting a USB device into something on the net. And there you are, John, protecting against this.

[00:43:00]

Graham-Cumming: Trying hard to protect against these things, yes absolutely. It's very interesting because you mentioned before how rapidly Cloudflare had grown over the last few years. And of course Sun also really got going pretty rapidly, didn’t it?

[00:43:00]

Gage: Well, yes. The first year we were just some students from Berkeley — hardware from Stanford, Andy Bechtolsheim; software from Berkeley, Berkeley Unix BSD, Bill Joy. Combine the two, and 10 of us or so, and I think the first year was 12 million booked, the second year was 50 or 60 million booked, and the third year was 150 or so million booked, and then we hit 500 million, and then we hit a billion. Now, we were selling boxes — we were a manufacturing company, so that’s different from software or services — but we also needed lots of people, and so we instantly drew on the immense benefit of the variety of people in the San Francisco Bay Area, with Berkeley and Stanford. We had students in computer science, and mechanical engineering, and physics, and mathematics from every country in the world, and we recruited from every country in the world. So a great part of Sun’s growth came, as yours is, from expanding internationally, and at one point I think we ran most of the telcos of the world. We ran China Mobile: 900 million subscribers on China Mobile, all Sun stuff in the back. Throughout Africa, every telco was running Sun and Cisco, until Huawei knocked Cisco out. It was an amazing time.

[00:44:55]

Graham-Cumming: You ran the machine that ran LaTeX, that let me get my doctoral thesis done.

[00:45:01]

Gage: You know, that’s how I got into it, actually. I was in econometrics and mathematics at Berkeley, and I walk down a hallway, and outside a room there was that funny smell of photographic paper, and there was perfectly typeset mathematics. Troff and nroff, all those old UNIX utilities for the Bell Technical Journal. And I open the door and I’ve got to get in there. There are two hundred people sitting in front of these beehive-like little terminals, all typing away on a UNIX system. And I want to get an account, and I walk down the hall, and there's this skinny guy who types about 200 words a minute, named Bill Joy. And I said, I need an account, I’ve got to typeset integral signs, and he said, what’s your name. I tell him my name, John Gage, and he goes voop, and I’ve never seen anybody type as fast as that in my life. This is a new world, here.

[00:45:58]

Graham-Cumming: So he was rude then?

[00:46:01]

Gage: Yeah he was, he was. Well, it’s interesting: the arrival of a device at Berkeley complemented the arrival of an MIT professor who had implemented mathematics in LISP — not typesetting of mathematics, but actual Macsyma. To get Professor Fateman, the Macsyma god from MIT, to come to Berkeley and live in a UNIX environment, we had to put a LISP up on the PDP. So Bill took that machine, which had virtual memory, and implemented the environment for significant computational mathematics. And Steve Wolfram took that to Caltech, and the Princeton Institute for Advanced Study, and now we have Mathematica. So in a way, all of Sun and the UNIX world derived from attempting to do executable mathematics.

[00:47:17]

Graham-Cumming: Which in some ways is what computers are doing. I think one of the things that people don’t really appreciate is the extent to which it's all numbers underneath.

[00:47:28]

Gage: Well, that’s just this discrete versus continuous problem that Michael Jordan is attempting to address. To my current total puzzlement and complete ignorance: what in the world is symplectic integration? And how do Lyapunov functions work? Oh, no clue.

[00:47:50]

Graham-Cumming: Are we going to do a second podcast on that? Are you going to come back and teach us?

[00:47:55]

Gage: Try it. We’re on, you’re on, you’re on. Absolutely. But you’ve got to run a company.

[00:48:00]

Graham-Cumming: Well I've got some things to do. Yeah. But you can go do that and come tell us about it.

[00:48:05]

Gage: All right, Great John. Well it was terrific to talk to you.

[00:48:08]

Graham-Cumming: So yes it was wonderful speaking to you as well. Thank you for helping me dig up memories of when I was first fooling around with Sun Systems and, you know, some of the early days and of course “The Network is the Computer,” I'm not sure I fully yet understand quite the metaphor or even if maybe I do somehow deeply in my soul get it, but we’re going to try and make it a reality, whatever it is.

[00:48:30]

Gage: Well, I count it as a complete success, because you count as one of our successes: because you're doing what you’re doing, the phrase “The Network is the Computer” resides in your brain, and when you get up in the morning and decide what to do, a little bit nudges you toward making the network work.

[00:48:51]

Graham-Cumming: I think that's probably true. And there's the dog, the dog is saying you've been yakking for an hour and now we better stop. So listen, thank you so much for taking the time. It was wonderful talking to you. You have a good day. Thank you very much.


Interested in hearing more? Listen to my conversations with Ray Rothrock and Greg Papadopoulos of Sun Microsystems.

To learn more about Cloudflare Workers, check out the use cases below:

  • Optimizely - Optimizely chose Workers when updating their experimentation platform to provide faster responses from the edge and support more experiments for their customers.
  • Cordial - Cordial used a “stable of Workers” to do custom Black Friday load shedding as well as using it as a serverless platform for building scalable customer-facing services.
  • AO.com - AO.com used Workers to avoid significant code changes to their underlying platform when migrating from a legacy provider to a modern cloud backend.
  • Pwned Passwords - Troy Hunt’s popular "Have I Been Pwned" project benefits from cache hit ratios of 94% on its Pwned Passwords API due to Workers.
  • Timely - Using Workers and Workers KV, Timely was able to safely migrate application endpoints using simple value updates to a distributed key-value store.
  • Quintype - Quintype was an eager adopter of Workers to cache content they previously considered un-cacheable and improve the user experience of their publishing platform.

07:02

The Network is the Computer: A Conversation with Ray Rothrock [The Cloudflare Blog]


Last week I spoke with Ray Rothrock, former Director of CAD/CAM Marketing at Sun Microsystems, to discuss his time at Sun and how the Internet has evolved. In this conversation, Ray discusses the importance of trust as a founding principle, the growth of Sun in sales and marketing, and the time he gave Vice President Bush a Sun demo. Listen to our conversation and read the full transcript below.


[00:00:07]

John Graham-Cumming: Here I am, very lucky to get to talk with Ray Rothrock, who was, I think, one of the first investors in Cloudflare, a Series A investor who got the company a little bit of money to get going. But if we dial back a few years earlier than that, he was also at Sun as the Director of CAD/CAM Marketing. There is a link between Sun and Cloudflare — at least one, but probably more than one — which is that Cloudflare has recently trademarked The Network is the Computer®. And that was a Sun trademark, wasn’t it?

[00:00:43]

Ray Rothrock: It was, yes.

[00:00:46]

Graham-Cumming: I talked to John Gage and I asked him about this as well and I asked him to explain to me what it meant. And I'm going to ask you the same thing because I remember walking around the Valley thinking, that sounds cool; I’m not sure I totally understand it. So perhaps you can tell me, was I right that it was cool, and what does it mean?

[00:01:06]

Rothrock: Well, it certainly was cool, and it was extraordinarily unique at the time. Just some quick background. In those early days when I was there, the whole concept of networking computers was brand new. Our competitor Apollo had a proprietary network, but Sun chose to go with TCP/IP, which was a standard at the time, but a brand new standard that very few people knew about, right? So when we started connecting computers and doing some intensive computing, which is what I was responsible for — CAD/CAM in those days was extremely intensive, whether it was electrical CAD/CAM, or mechanical CAD/CAM, or even simulation and solid design modeling — having a little extra power from other computers was a big deal. And so this concept of “The Network is the Computer” essentially said that you had one window into the network through your desktop computer in those days — there was no mobile computing at that time; this was like ’84, ’85, ’86, I think. And so if you had the appropriate software, you could use other people's computers for CPU power, and so you could do very hard problems that a single computer could not do, because you could offload some of that CPU to the other computers. Now, that was very nerdy, very engineering intensive, and not many people did it. We’d go to SIGGRAPH, which was a huge graphics show in those days, and we would demonstrate ten Sun computers, for example, doing graphic rendering of a 3D wireframe that had been created in CAD/CAM software of some sort. And it was hard, and that was on the mechanical side. On the electrical side, Berkeley had some software called Magic — it’s still around, very popular EDA software that incorporated those concepts. But to imagine calculating the paths in a very complicated PCB or a very complicated chip — one computer couldn't do it, but Sun had the fundamental technology. So from my seat at Sun at the time, I had access to what could be infinite computing power, even though I had a single application running, and that was a big selling point for me when I was trying to convince EDA and MDA companies to put their software on the Sun. That was my job.

[00:03:38]

Graham-Cumming: And hearing it now, it doesn’t sound very revolutionary, because of course we’re all doing that now. I mean I get my phone out of my pocket and connect to goodness knows what computing power which does image recognition and spots faces and I can do all sorts of things. But walk me through what it felt like at the time.

[00:03:56]

Rothrock: Just doing a Google search — I mean, how many data stores are being spun up for that? At the time it was incredible, because you could actually do side-by-side comparisons. We created some demonstrations where one computer might take ten hours to do a calculation, two computers might take three hours, five computers might take 30 minutes. So with this demo, you could turn on computers and we would go out on the TCP/IP network to look for an available CPU that could give me some time. Let's go back even further. Probably 15 years before that, we had time sharing. You had a terminal into a big mainframe, and it did all this swapping in and out of stuff to give you a time slice of computing. We were doing the exact same thing, except we were CPU slicing, not just time slicing. That’s pretty nerdy, but that's what we did. And I had to work with the engineering department, with all these great engineers in those days, to make this work for a demo. It was so unique, you know, their eyes would get big. You remember Novell...

[00:05:37]

Graham-Cumming: I was literally just thinking about Novell because I actually worked on IPX and SPX networking stuff at the time. I was going to ask you actually, to what extent do you think TCP/IP was a very important part of this revolution?

[00:05:55]

Rothrock: It was huge. It was fundamentally huge because it was a standard, so it was available, and if you implemented it, you didn’t have to pay for it. When Bob Metcalfe did Ethernet, the TCP stack went on top of it. Sun, in my memory — and I could be wrong — was the first company to put a TCP/IP stack on the computer. And so you just plugged an RJ45 into the back, into this TCP/IP network with a switch or a router on it, and you were golden. They made it so simple and so cheap that you just did it. And of course, if you give an engineer that kind of freedom, it opens things up. By the way, as the marketing guy at Sun, this was my first non-engineering job. I came from the very technical world of nuclear physics into Sun. And so it was stunning, just stunning.

[00:06:59]

Graham-Cumming: It’s interesting that you mentioned Novell, and you mentioned Apollo before that, and obviously IBM had SNA networking, and there were attempts to do all those networking things. It's interesting that these open standards have really enabled the explosion of everything else we've seen, everything that's going on in the Internet.

[00:07:23]

Rothrock: Sun was open, so to speak, but this concept of open source now just dominates the conversation. As a venture capitalist, every deal I ever invested in had open source of some sort in it. There was a while when it was very problematic in an M&A event, but the world’s gotten used to it. So open is very powerful. It's like freedom. It's like liberty. Like today, July 4th, it’s a big deal.

[00:07:52]

Graham-Cumming: Yes, absolutely. It’s just interesting to see it explode today, because I spent a lot of my career looking at so many different networking protocols. The thing that really surprises me, or perhaps shouldn’t surprise me, is that when you’ve got these open things, you harness so many people's intelligence that you just end up with something better. It seems simple.

[00:08:15]

Rothrock: It seems simple. I think part of the magic of Sun is that they made it easy. Easy is the most powerful thing you can do in computing. Computing can be so nerdy and so difficult. But if you just make it easy — and Cloudflare has done a great job with that; they did it with their DNS service, they did it with all the stuff we worked on back when I was on the board and actively involved in the company. You’ve got to make it easy. I mean, I remember when Matthew and Lee worked like 20 hours a day on how to switch your DNS from whoever your provider was to Cloudflare. That was supposed to be one click, done. A to B. And that DNA was part of the magic. And whether or not we agree that Sun did it that way, to me at least, Sun did it that way as well. So it's huge, a huge lift.

[00:09:08]

Graham-Cumming: It’s funny you talk about that, because at the time, how that actually worked is that we just asked people to give us their username and password, and we logged in and did it for them. Early on, Matthew asked me if I’d be interested in joining Cloudflare when it was brand new, and because of other reasons I’d moved back to the UK and I wasn’t ready to change jobs; I’d just taken another job. And I remember thinking, this thing is crazy, this Cloudflare thing. Who's going to hand over their DNS and their traffic to these four or five people above a nail salon in Palo Alto? And Matthew’s response was, “They’re giving us their passwords, let alone their traffic.” Because they were so desperate for it.

[00:09:54]

Rothrock: It tells you a lot about Matthew. You know, as an attorney, I mean, he was very sensitive to that and believes that one of the founding principles is trust. His view was that, if I ever lose the customer’s trust, Cloudflare is toast. And so everything focused around that key value. And he was right.

[00:10:18]

Graham-Cumming: And you must have, at Sun, been involved with some high performance computing things that involved sensitive customers doing cryptography and things like that. So again trust is another theme that runs through there as well.

[00:10:33]

Rothrock: Yeah, very true. As the marketing guy for CAD/CAM, I was in the field two-thirds of the time, showing customers what was possible. My job was to get third-party software onto the Sun box and then to turn that into a presentation to a customer. So I visited many government customers, many aerospace, power — all these very highfalutin, sort of behind-the-firewall kinds of guys in those days. So yes, trust was huge. It would come up: “Okay, so I’m using your CPU; how is it that you can’t use mine? And how do you convince me that you've not violated something?” In those days it was a whole different conversation than it is today, but it was nonetheless just as important. In fact, I remember I spent quite a bit of time at NCSA at the University of Illinois Urbana-Champaign. Larry Smarr was the head of NCSA, and we spent a lot of time with Larry. I think John was there with me — John Gage, and Vinod, and some others — but it was a big deal talking about high performance computing, because that's what they were doing, and doing it with Sun.

[00:11:50]

Graham-Cumming: So just to dial forward: you’re at Venrock and you decide to invest in Cloudflare. What was it that made you think that this was worth investing in? Presumably you saw some things that echoed Sun’s vision, because Sun had a very wide-ranging vision of what was going to be possible with computing.

[00:12:11]

Rothrock: Yeah. Let me touch on a few points. Certainly Sun was the first computer company I worked for after I got out of the nuclear business, and the philosophy of the company was very powerful. Not only did we have this cool 19-inch black and white giant Macintosh, essentially — although the Mac wasn't even born yet — it had this ease of use that was powerful, and it had this openness; we preached that all the time and we made it possible. And Cloudflare — the related philosophy, Matthew and Michelle's genius — was that they wanted to make security and distribution of data as free and easy as possible for the long tail. That was the first thinking, because if you were in the long tail, a small company, you didn't have access; you were just going to get whipped around by the big boys. And so there was a bit of, “We're here to help you; we're going to do it.” It's a good thing that the long tail gets mobilized, if you will, or emboldened to use the Internet like the big boys do. And that was part of the attractiveness. I didn't say, “Boy, Matthew, this sounds like Sun,” but the concept of open and liberating, which is what they were trying to do with this long-tail DNS and CDN stuff, was very compelling and seemed easy. But nothing ever is. They made it look easy, though.

[00:13:52]

Graham-Cumming: Yeah, it never is. One of the parallels that I’ve noticed is that early on at Sun, a lot of Sun equipment went to companies that later became big companies. Some of these small firms that were using crazy workstations ended up becoming some of the big names in the Valley. To your point about the long tail, they were being ignored and couldn’t buy from IBM even if they wanted to.

[00:14:25]

Rothrock: They couldn’t afford SNA and they couldn’t do lots of things. So Sun was an enabler for these companies with cool ideas for products and software, using Sun as the underpinning. Workstations were all the rage, because PCs were very limited in those days. Very, very limited; they were all Intel based. Sun was 68000-based originally — in the beginning it was a cheap microprocessor from Motorola — and then it was their own stuff, SPARC.

[00:15:04]

Graham-Cumming: What was the growth like at Sun? Because it was very fast, right?

[00:15:09]

Rothrock: Oh yes, it was extraordinarily fast. I think I was employee 130 or something like that. I left Sun in 1986 to go to business school and they gave me a leave of absence. Carol Bartz was my boss at that moment. The company was at about 2,000 people just two and a half years later. So it was growing like a weed. I measured my success by how thick the Catalyst — that was the name of our catalog and our program — how thick it was and how quickly I could add bona fide software developers to it. When I first got there, our Catalyst catalog was one sheet of paper, printed front to back; when I left, it was a book about three-quarters of an inch thick. My group grew from me to 30 people in about a year and a half. It was extraordinary growth. We went public during that time, had a lot of capital and a lot of buzz. And there was that openness, while our competition was all proprietary, just like you were citing there, John. IBM and Apollo were proprietary networks. You could buy a NIC card, stick it into your PC, and talk to a Sun, and vice versa. And you couldn’t do that with IBM or Apollo. Do you remember those?

[00:16:48]

Graham-Cumming: I do, because I was talking to John Gage about this. In my first job out of college, I wrote a TCP/IP stack from scratch for a manufacturer of network cards. The test of this stack was that I had an HP Apollo box and a Sun workstation, and there was this sort of magical question: can I talk to these devices? Can I ping them? That first ping as it went across the network was already magical. And then, can I Telnet to one of these? So you know, getting the networking actually running was the key thing. How important was networking for Sun in the early days? Was it always there?

[00:17:35]

Rothrock: Yeah, it was there from the beginning, the idea of having network capability. When I got there it was networked; the machine wasn’t standalone at all. We sort of mimicked the mainframe world, where we had green screens hooked into a Sun in a department, for example, and there was time sharing. But as soon as you got a Sun on your desk — which was rare, because we were shipping as many as we could build — it was fantastic. I was sharing information with engineering and we were working back and forth on stuff. But I think it was fundamental: you have a microprocessor, you’ve got a big screen, you’ve got a graphical UI, and you have a network that hooks into the greater universe. In those days, to send an all-Sun email around the world, modems spun up everywhere. The network wasn’t what it is now.

[00:18:35]

Graham-Cumming: I remember in about ’89, I was at a conference and Whit Diffie was there, in a little computer room; I was trying to typeset something. I asked him what he was doing, and he said, “I’m telnetting into a machine which is in San Diego.” It was the first time I’d seen this, and I stepped over and he was like, “look at this.” And he’s hitting the keyboard and the keys are getting echoed back. And I thought, oh my goodness, this is incredible. Right across the Atlantic, and across the country as well.

[00:19:10]

Rothrock: I think — and this is just me talking, having lived the years since, with all the investing and stuff I did — it enabled the Internet to come about, the TCP/IP standard. You may recall that Microsoft tried to modify the TCP/IP stack slightly, and the world rejected it, because the standard was just too powerful, too pervasive. And then along comes HTTP and all the other protocols that followed. Telnetting, FTPing, all that file transfer stuff — we were doing that left, right, and center back in the 80s. Cloudflare took all this stuff and made it better, easier, and literally lower friction. That was the core investment thesis at the time, and it just exploded, much like when Sun adopted TCP/IP. You were there when it happened. My little company that I’m the CEO of now, we use Cloudflare services. The first thing I did when I got there was switch to Cloudflare.

[00:20:18]

Graham-Cumming: And that was one of the things, when I joined: we really wanted people to get to a point where, if you’re putting something on the web, you just say, well, I’m going to put Cloudflare (or a thing like Cloudflare) on it, because it protects it, it makes it faster, and so on. And of course now what we've done is give people compute facilities. Right now you can write code and run it on our machines worldwide, which is another whole thing.

[00:20:43]

Rothrock: And that is “The Network is the Computer.” The other thing Sun was pitching then was the paperless office. I remember we had posters of paper flying out of a computer window on a Sun workstation, and I don't think we've gotten there yet. But certainly, the network is the computer.

[00:21:04]

Graham-Cumming: It was probably the case that the paperless office was one of those things that was about to happen for quite a long time.

[00:21:14]

Rothrock: It's still about to happen, if you ask me. I think e-commerce and the digital transformation have driven it harder than networking alone. You know, the fact that we can now sign legal documents over the Internet without paper, and things like that. People have to adopt. People have to trust. People have to adopt these standards and accept them. And lo and behold, we are, because we made it easy, we made it cheap, and we made it trustworthy.

[00:21:42]

Graham-Cumming: If you dial back through Sun, what was the hardest thing? I’m asking because I’m at a 1,000-person company and it feels hard some days, so I’m curious. What do I need to start worrying about?

[00:22:03]

Rothrock: Well yeah, at 1,000 people, I think that’s when John came into the company and sort of organized marketing. I would say, holding engineering to schedules; that was hard. That was hard because we were pushing the envelope: our graphics were going from black and white to color, and the networking stuff, the performance of all the chips on the boards, was a big deal. And I remember, for me personally, I would go to a trade show. I'd go to Boston to the Association of Mechanical Engineers with the team there, we'd set up these workstations, and of course the engineers want to show off the latest. So I would be bringing with me tapes that we had of the latest operating system. But getting the engineers to be ready for a trade show was very hard because they were always experimenting. I don't believe the term “code freeze” meant much to them, frankly, but we would be downloading the software and building a trade show setup that had to run for three days on the latest and greatest, and we knew our competitor would be right across the aisle from us, showing their hot stuff. And working with Eric Schmidt in those days, it was, you know, Eric, you've just got to be done on this date. But trade shows were wonderful. They focused the company's efforts, if you will. And marketing and sales drove Sun; Scott McNealy’s culture there was big on that. But we had to show. It’s different today than it was then, I don’t know about the Cloudflare competition, but back then there were a dozen workstation companies and we were fighting for mindshare and market share every day. So you didn't dare leave your best jewels at home. You brought them with you. I will give John Gage high, high marks. He showed me how to dance through a reboot in case the code crashed. He’s marvelous, and I learned how to work that stuff and survive.

[00:24:25]

Rothrock: Can I tell you one sort of sales story?  

[00:24:28]

Graham-Cumming: Yes, I’m very interested in hearing the non-technical stories. As an engineer, I can hear engineering stories all the time, but I’m curious what it was like being in sales and marketing in such an engineering-heavy company as Sun.

[00:24:48]

Rothrock: Yeah. Well it was challenging of course. One of the strategies that Sun had in those days was to get anyone who was building their own computer, Computervision and Data General and all those guys, to adopt the Sun as their hardware platform; then they could put on whatever they wanted. Because I was one of the demo gods, my job was to go along with the sales guys when they wanted to try to convince somebody. So one of the companies we went after was Data General (DG) in Massachusetts. I worked for weeks on getting this whole demo suite running: MDA, EDA, word processing, I had everything. And this was a big, big, big deal, I mean hundreds of millions of dollars of revenue. So I went out a couple of days early, we were going to put up a bunch of Suns, and I had a demo room at DG. All the gear showed up, and I got there at like 5:30 in the morning and started downloading everything, downloading software, making it dance. And at about 8:00 in the morning the CEO of Data General walks in. I didn't know who he was, but it turned out to be Ed de Castro. He introduced himself and said, “What are you doing?” And I explained, “I’m from Sun, I’m getting ready for a big demo. We’ve got a big executive presentation. Mr. McNealy will be here shortly, etc.” And he said, “Well, show me what you’ve got.” So, still in the middle of downloading the software, I start making this thing dance. I’ve got these machines talking to each other and showing all kinds of cool stuff. And he left. The meeting was about 10 or 11 in the morning, and when the executive team from Sun showed up they said, “Well, how's it going?” I said, “Well, I gave a demo to a guy,” and they asked, “Who's the guy?” and I said, “It was Ed de Castro.” And they went, “Oh my God, that was the CEO.” Well, we got the deal. I think Ed had a little tactic there: come in early, see what he could see, maybe get the true skinny on this thing and see what’s real. I carried the day, and I got a nice little bonus for that. But Vinod and I would also drop into Lockheed down in Southern California. They wanted to put Suns on P-3 airplanes, and we'd go down there with an engineer and figure out how to make it work. Those were just incredible times. You may remember, back in the 80s everyone dressed up except on Fridays; it was dress-down Fridays. One day I dressed down, and Carol Bartz, my boss, saw me wearing blue jeans and just an open-collared shirt, and she said, “Rothrock, you go home and put on a suit! You never know when a customer is going to walk in the front door.” She was quite right. Kodak showed up; Kodak made a big investment in Sun when it was still private, and I gave that demo. Then AT&T. And then, interestingly, Vice President Bush, back in the Reagan administration, came to Sun to see the manufacturing, and I gave the demo to the Vice President with Scott and Andy and Bill and Vinod standing there.

[00:28:15]

Graham-Cumming: Do you remember what he saw?

[00:28:18]

Rothrock: It was my standard two-minute Sun demo that I can give in my sleep. We were on the manufacturing floor. We picked up a machine, I created a demo for it, and my executive team was there. We have a picture of it somewhere, but it was fun. As John Gage would say, “Ray, your job is to make the computer dance.” So I did.

[00:28:44]

Graham-Cumming: And one of the other things I wanted to ask you about is that at some point Sun was almost Amazon Web Services, wasn't it? There was a rent-a-computer service, right?

[00:28:53]

Rothrock: I don't know. I don't remember the rent-a-computer service. I remember we went after the PC business aggressively and went after the data centers which were brand new in those days pretty aggressively, but I don’t remember the rent-a-computer business that much. It wasn’t in my domain.

[00:29:14]

Graham-Cumming: So what are you up to these days?

[00:29:18]

Rothrock: I’m still investing. I do a lot of security investing. I did 15 deals while I was at Venrock. Cloudflare was the last one I did, which turned out really well of course. More to come, I hope. And I’m CEO of one of Venrock’s portfolio companies that had a little trouble a few years back, but I fixed that and it’s moving up nicely now. I’ve also started thinking about more of a science base. I’m on the board of the Carnegie Institution for Science. I'm on the board of MIT, and I just joined the board of the Nuclear Threat Initiative in Washington, which is run by Ernie Moniz, the former Secretary of Energy. So I’m doing stuff like that. John would be pleased with how well that played through. But I'll tell you, it is these fundamental principles, tying it all back to Sun and Cloudflare: this sort of open, cheap, easy approach, enabling humans to do things without too much friction, that is exciting. I mean, look at your phone. Steve Jobs was the master of design to make this thing as sweet as it is.

[00:30:37]

Graham-Cumming: Yes, and as addictive.

[00:30:39]

Rothrock: Absolutely, right. I haven’t been to a presentation from Cloudflare in two years, but every time I see an announcement, like the DNS service, I act on it; I immediately switched all my DNS here at the house to 1.1.1.1. Stuff like that. Because I know it’s good and I know it’s trustworthy, and it’s got that philosophy built into its DNA.

[00:31:09]

Graham-Cumming: Yes, definitely. Taking it back to what we talked about at the beginning, trustworthiness is definitely something that Cloudflare has cared about from the start and continues to care about. We’re sort of the guardians of the traffic that passes through our network.

[00:31:25]

Rothrock: Back when the Internet started happening and when Sun was doing Java, all those things in the 90s, I was of course at Venrock, but I was still pretty connected to [Edward] Zander and [Scott] McNealy. We were hoping that it would be liberating, that it would create a world which was much more free and open to conversation, and we’ve seen the dark side of some of that. But I continue to believe that transparency and openness is a good thing and we should never shut it down. I don't mean to go all philosophical here, but way more good comes from being open and transparent than bad.

[00:32:07]

Graham-Cumming: Listen, it's July 4th. It's evening here in London. We can wax philosophical as much as we like. Well, listen, thank you for taking the time to chat with me. Are there any other reminiscences of Sun that you think the public needs to hear in this oral history of “The Network is the Computer”?

[00:32:28]

Rothrock: Well, you know, the only thing I'd say is, having landed in Silicon Valley in 1981 and gotten on with Sun, I can say this given my age and longevity here: everything is built on somebody else's great ideas. Starting with TCP/IP, and then HTML and browsers, it’s just layer on layer on layer, and so Cloudflare is just one of the latest to climb on the shoulders of the giants who put it all together. I mean, we don’t even think about the physical network anymore. But it is there, and thank goodness companies like Cloudflare keep providing that fundamental service on which we can build interesting, cool, exciting, and mind-changing things. And without a Cloudflare, without Sun, without Apollo, without all those guys back in the day, it would be different. The world would just be so, so different. I do the New York Times crossword puzzle; I could not do it without Google, because it gives me access to information I would not have unless I went to the library. It’s exponential and it just gets better. Thanks to Michelle and Matthew and Lee for starting Cloudflare and allowing Venrock to invest in it.

[00:34:01]

Graham-Cumming: Well, thank you for being an investor. I mean, it helped us get off the ground and get things moving. I very much agree with you about standing on the shoulders of giants, because people don't appreciate the extent to which so much of this fundamental work was done in the 70s and 80s.

[00:34:19]

Rothrock: Yeah, it’s just like the automobile and the airplane. We reminisce about the history, but boy, there were a lot of giants in those industries as well. And computing is just the latest.

[00:34:32]

Graham-Cumming: Yep, absolutely. Well, Ray, thank you. Have a good afternoon.


Interested in hearing more? Listen to my conversations with John Gage and Greg Papadopoulos of Sun Microsystems:

To learn more about Cloudflare Workers, check out the use cases below:

  • Optimizely - Optimizely chose Workers when updating their experimentation platform to provide faster responses from the edge and support more experiments for their customers.
  • Cordial - Cordial used a “stable of Workers” to do custom Black Friday load shedding as well as using it as a serverless platform for building scalable customer-facing services.
  • AO.com - AO.com used Workers to avoid significant code changes to their underlying platform when migrating from a legacy provider to a modern cloud backend.
  • Pwned Passwords - Troy Hunt’s popular "Have I Been Pwned" project benefits from cache hit ratios of 94% on its Pwned Passwords API due to Workers.
  • Timely - Using Workers and Workers KV, Timely was able to safely migrate application endpoints using simple value updates to a distributed key-value store.
  • Quintype - Quintype was an eager adopter of Workers to cache content they previously considered un-cacheable and improve the user experience of their publishing platform.

07:01

The Network is the Computer: A Conversation with Greg Papadopoulos [The Cloudflare Blog]


I spoke with Greg Papadopoulos, former CTO of Sun Microsystems, to discuss the origins and meaning of The Network is the Computer®, as well as Cloudflare’s role in the evolution of the phrase. During our conversation, we considered the inevitability of latency, the slowness of the speed of light, and the future of Cloudflare’s newly acquired trademark. Listen to our conversation and read the full transcript below (or click here to open in a new window).


[00:00:08]

John Graham-Cumming: Thank you so much for taking the time to chat with me. I've got Greg Papadopoulos who was CTO of Sun and is currently a venture capitalist. Tell us about “The Network is the Computer.”

[00:00:22]

Greg Papadopoulos: Well, from certainly a Sun perspective, the very first Sun-1 was connected via Internet protocols and at that time there was a big war about what should win from a networking point of view. And there was a dedication there that everything that we made was going to interoperate on the network over open standards, and from day one in the company, it was always that thought. It's really about the collection of these machines and how they interact with one another, and of course that puts the network in the middle of it. And then it becomes hard to, you know, where's the line? But it is one of those things that I think even if you ask most people at Sun, you go, “Okay explain to me ‘The Network is the Computer.’” It would get rather meta. People would see that phrase and sort of react to it in their own way. But it would always come back to something similar to what I had said I think in the earlier days.

[00:01:37]

Graham-Cumming: I remember it very well because it was obviously plastered everywhere in Silicon Valley for a while. And it sounded incredibly cool but I was never quite sure what it meant. It sounded like it was one of those things that was super deep but I couldn't dig deep enough. But it sort of seems like this whole vision has come true because if you dial back to I think it's 2006, you wrote a blog post about how the world was only going to need five or seven or some small number of computers. And that was also linked to this as well, wasn't it?

[00:02:05]

Papadopoulos: Yeah. I think as things began to evolve into what we would call cloud computing today, you could put substantial resources on the other side of the network, and from the end user’s perspective those could be as effective or more effective than something you'd have in front of you. And so there was this idea that you really could provide these larger-scale computing services; in the early days, you know, grid was the term used before cloud. But if you follow that logic, and you watch what was happening to the improvements of the network: Dave Patterson at Cal was very fond of saying in that era, in the 90s, that networks were getting to the place where the disk connected to another machine is transparent to you. In fact, somebody else's memory may be closer to you than your own disk. And that's a pretty interesting thought. And so where we ended up going was really a complete realization that these things we would call servers were actually just components of this network computer. And so it was very mysterious, “The Network is the Computer,” and it actually grew into itself in this way. And I'll say, looking at Cloudflare, you see this next level of scale happening. It's not just what are those things that you build inside a data center and how do you connect to it; in fact, it's the network that is the computer that is the network.

[00:04:26]

Graham-Cumming: It's interesting, though, that there have been these waves of centralization, and then pushes of the computing power out to the edge and the PCs at some point, and then Larry Ellison came along and he was going to have this network computer thing, and it sort of seems to swing back and forth. So where do you think we are in this swinging?

[00:04:44]

Papadopoulos: You know, I don't think it's so much swinging. I think it's a spiral upwards: we come to a place, we look down, and it looks familiar. You know, where you'll say, oh I see, here's a 3270 connected to a mainframe. Well, that looks like a browser connected to a web server. And here's the device, it’s connected to the web service. They look similar, but there are some very important differences as we're traversing this helix of sorts. If you look back, for example, the 3270 was inextricably bound to the single server that hosted it. Now our devices have the ability to connect to any other computer on the network. So while I think we're seeing something that looks like a pendulum there, it’s really a refactoring question: what software belongs where, and how hard is it to maintain where it is? The Internet protocol is clearly a peer-to-peer protocol, so it doesn't take sides on this. Whether we end up in one state, with more on the client or less on the client, I think really has to do with how well we've figured out distributed computing and how well we can deliver code in a management-free way. And that's a longer conversation.

[00:06:35]

Graham-Cumming: Well, it's an interesting conversation. One thing you talked about was Sun Grid, which eventually brings us to Amazon Web Services and things like that: there was the device, be it your handheld or your laptop, talking to some cloud computing. And then what Cloudflare has done with this Workers product is to say, well, actually I think there are three places where code could exist. There's something you can put inside the network.

[00:07:02]

Papadopoulos: Yes. And by extension that could grow to another layer too. And it goes back to, I think it's Dave Clark who I first remember saying you can get all the bandwidth you want, that's money, but you can't reduce latency. That's God, right. And so I think there are certainly things and as I see the Workers architecture, there are two things going on. There's clearly something to be said about latency there, and having distributed points of presence and getting closer to the clients. And there’s IBM with interaction there too, but it is also something that is around management of software and how we should be thinking in delivery of applications, which ultimately I believe, in the limit, become more distributed-looking than they are now. It's just that it's really hard to write distributed applications in kind of the general way we think about it.

[00:08:18]

Graham-Cumming: Yes, that's one of those things, isn’t it: it is exceedingly hard to actually write these things, which is why I think we're going through a bit of a transition right now where people are trying to figure out where that code should actually execute and what should execute where.

[00:08:31]

Papadopoulos: Yeah. You had graciously pointed out this blog from a dozen years ago saying, hey, this concentration of computing is inevitable, for economic reasons as much as anything else. But it's both a hammer and a nail. You know, cloud stuff is in some ways unnatural: why should we expect computing to get concentrated like it is? If you really look into it more deeply, I think it has to do with management and control and capital cycles, things that are on the economic and administrative side, and not about what's truth and beauty and the destination for where applications should be.

[00:09:27]

Graham-Cumming: And I think you also see some companies are now starting to wrestle with the economics of the cloud where they realize that they are kind of locked into their cloud provider and are paying rent kind of thing; it becomes entirely economic at that point.

[00:09:41]

Papadopoulos: Well it does, and you know, this was also something I was pretty vocal about, although I got misinterpreted for a while there as being anti-cloud or something, which I'm not; I think I'm pragmatic about it. One of the dangers, certainly as people yield to SaaS products, is that unless you have explicit contracts and the ability to disgorge that data from the service, your data becomes more and more captive. And that's the part that I think is actually the real question here: what's the switching cost from one service to another, from one cloud to another?

[00:10:35]

Graham-Cumming: Yes, absolutely. That's one of the things that we faced, and one of the reasons why we worked on this thing called the Bandwidth Alliance: one of the ways in which stuff gets locked into clouds is that the egress fee is so large that you don't want to get your data out.

[00:10:50]

Papadopoulos: Exactly. And then there is always the, you know, well we have these particular features in our particular cloud that are very seductive to developers and you write to them and it's kind of hard to undo, you know, just the physics of moving things around. So what you all have been doing there is I think necessary and quite progressive. But we can do more.

[00:11:17]

Graham-Cumming: Yes, definitely. Just to go back to the thought about latency and bandwidth: I have a jokey pair of slides where I show the average broadband connection you can buy over time, going up, and then the change in the speed of light over the same period, which of course is entirely flat, zero progress in the speed of light. Looking back through your biography, you worked at Thinking Machines, and I assume that fighting latency at a much shorter distance of cabling must have been interesting in those machines because of the speeds at which they were operating.

[00:11:54]

Papadopoulos: Yes, it surprises most people when you say it, but computer architects complain that the speed of light is really slow. And you know, Grace Hopper, who is really one of the founders, one of the pioneers of modern programming languages and COBOL (I think she was a vice admiral), would walk around with a wire that was a foot long and say, “this is a nanosecond.” And that seemed pretty short for a while, but a nanosecond is an eternity these days.

[00:12:40]

Graham-Cumming: Yes, it's an eternity. People who aren't thinking about it don't quite appreciate how long it is. I had someone who was new to the computing world come to me with a book about fiber optics, and in the book it said there is a laser that flashes on and off a billion times a second to send data down the fiber. And he said, “This can't possibly be true; it's just too fast.”

[00:13:09]

Papadopoulos: No, it's too slow!

[00:13:12]

Graham-Cumming: Right? And I thought, well that’s slow. And then I stepped back and thought, you know, to the average person, that is a ridiculous statement, that somehow we humans have managed to control time at this ridiculously small level. And then we keep pushing and pushing and pushing it and people don't appreciate how fast and actually how slow the light is, really.

[00:13:33]

Papadopoulos: Yeah. And I think if it actually comes down to it, in a very pure reckoning of this, latency is the only thing that matters. One can look at bandwidth as a component of latency: you can see bandwidth as a serialization delay, and that kind of goes back to the Clark thing. Yeah, I can buy bandwidth, but I can't bribe God on the other side, so I'm fundamentally left with this problem that we have. Thank you, Albert Einstein, right? It's kind of hopeless to think about sending information faster than that.

[00:14:09]

Graham-Cumming: Yeah, exactly. There are information limits, which is part of what's driving why we have such powerful phones: the latency to the human is very low if you have the device in your hand.

[00:14:23]

Papadopoulos: Yes, absolutely. This is where the edge architecture and the Workers structure that you guys are working on becomes really interesting, because it gives me (you talked about it earlier, we're now introducing this new tier) a really close place, from a latency point of view, to have some intimate relationship with a device, and at the same time be well-connected to the network.

[00:14:55]

Graham-Cumming: Right. And I think the other thing that is interesting about that is that your device is fundamentally an insecure thing, so if you put code on it, you can't put secrets in it, like cryptographic secrets, because the end user has access to them. Normally you would keep those on the server somewhere. But the other funny thing is, if you have this intermediary tier which is both secure and low-latency to the end user, you suddenly have a different world in which you can put secrets and privileged code, and it can interact with the user very, very rapidly because of the low latency.

[00:15:30]

Papadopoulos: Yeah. And that essence of where's my trust domain. I've seen all kinds of things where I cannot believe somebody is doing it: putting their S3 credentials down on a device, or the login for a database or something, and having it talk. You must be kidding. That trust proxy point at low latency is a really key thing.

[00:16:02]

Graham-Cumming: Yes, I think people just need to start thinking about that architecture. Is there a parallel with what was going on in very high-performance computing, the massively parallel stuff, and what's happening today? What lessons can we take from work done in the 70s and 80s and apply to the Internet of today?

[00:16:24]

Papadopoulos: Well, as we talked about, there are a couple of fundamental issues here. One we've been speaking about is latency. The other is synchronization, and this comes up in a bunch of different ways: whether it's when one looks at the CAP theorem kinds of things that Eric Brewer is famous for, can I get consistency and availability and survive partitionability, all at the same time? And so you end up in this kind of place, it goes back to Einstein a bit, where knowing when things have happened and when state has actually been changed or committed is a pretty profound problem.

[00:17:15]

Graham-Cumming: It is, and in what order things have happened.

[00:17:18]

Papadopoulos: Yes. And that order is going to be relative to an observer here as well. So if you're insisting on some total ordering, then you're insisting on slowing things down as well. And that really is fundamental. We were pushing into that in the massively parallel stuff, and you'll see it at Internet scale. You know, there's another thing, if I could. This is one of my greatest “aha”s about networks, and it's due to a fellow at Sun, Rob Gingell, who actually ended up being chief engineer at Sun and was one of the real pioneers of the software development framework that brought Solaris forward. Rob would talk about this thing that I label as network entropy. It's basically: when you connect systems to networks, what do networks do to those systems? And this is a little bit of a philosophical question, not a physical one. Rob observed that over time networks have this property of wanting to decompose things into constituent parts, have those parts get specialized, and then reintegrate them. Let me make that less abstract. In the early days of connecting systems to networks, one natural observation was: why don't we take the storage out of those desktop or server systems and put it on the other side of at least a local network, into a file server or storage server? So you could see the computer get pulled apart between its computing and its storage pieces. And then that storage piece, in Rob's next step, would go on and get specialized, so whole companies started, like Network Appliance, Pure Storage, EMC; big pieces of industry. Or look at the original routers: they were RADb, you know, running on workstations, and Cisco went and took that and made it into something. And you now see this effect happen at the next scale. One of the things that really got me excited when I first saw Cloudflare a decade ago was: wow, okay, a component like a network firewall can get pulled away, created as its own network entity, and specialized. And I think one of the most profound things, at least from my history of Cloudflare, was that as you guys went in and separated off these functions early on, the fear of people was that this was going to introduce latency, and in fact things got faster. Figure that.

[00:20:51]

Graham-Cumming: Part of that of course is caching, and then there's dealing with the speed of light by being close to people. But also, if you say your company makes things faster, and you do all these different things including security, you are forced to optimize the whole thing to live up to the claim. Whereas if you try to chain things together, nobody's really responsible for the overall latency budget. It becomes natural that you have to do it.

[00:21:18]

Papadopoulos: Yes. And you all have done it brilliantly, to take Gingell’s view: okay, so this piece got decomposed and then specialized, meaning optimized like heck, because that's what you do. And you can see that over and over again; you see it in Twilio or something. Here's a messaging service: I'm just pulling my applications apart, letting people specialize. But the final piece, and this is really the punchline, the final piece is, as Rob would talk about it, the value is in the reintegration. So what are those unifying forces that are creating, if you will, the operating system for “The Network is the Computer”? You were asking about the massively parallel scale; well, we had an operating system we wrote for this. As you get up to the higher scale, you get into these more distributed circumstances where the complexity goes up by some important number of orders of magnitude, and now what's that reintegration? And so I come back and look at what Cloudflare is doing here. You're entering into that phase now of actually being that re-integrator, almost that operating system for the computer that is the network.

[00:23:06]

Graham-Cumming: I think that's right. We often talk about actually being an operating system on the Internet, so very similar kind of thoughts.

[00:23:14]

Papadopoulos: Yes. And you know, as we were talking earlier about how developers make sense of this pendulum, or cycle, or whatever it is: having this idea of an operating system, of a place where I can have ground truths and trust and sort of fixed points in all this, is terribly important.

[00:23:44]

Graham-Cumming: Absolutely. So do you have any final thoughts on, what, it must be 30 years on from when “The Network is the Computer” was a Sun trademark. Now it's a Cloudflare trademark. What's the future of that slogan going to look like, and who's going to trademark it in 30 years' time?

[00:24:03]

Papadopoulos: Well, it could be interplanetary at that point.

[00:24:13]

Graham-Cumming: Well, if you're talking about going interplanetary, we definitely have to solve the latency problem.

[00:24:18]

Papadopoulos: Yeah. People do understand that. They go, wow, it’s like seven minutes between here and Mars at close approach.

[00:24:28]

Graham-Cumming: The earthly equivalent of that is New Zealand. If you speak to people from New Zealand, when they come on holiday to Europe or move to the US, they suddenly say that the Internet works so much better here. And it’s just that it's closer. Now, the Australians have figured this out, because Australia is actually drifting northwards, so they're eventually going to get close enough. That's going to fix it for them, but New Zealand is stuck.

[00:24:56]

Papadopoulos: I do ask my physicist friends for one of two things: either give me a faster speed of light (so far they have not delivered) or another dimension I can cut through. Maybe we'll keep working on the latter.

[00:25:16]

Graham-Cumming: All right. Well listen Greg, thank you for the conversation. Thank you for thinking about this stuff many many years ago. I think we're getting there slowly on some of this work. And yeah, good talking to you.

[00:25:27]

Papadopoulos: Well, you too. And thank you for carrying the torch forward. I think everyone from Sun who listens to this, and John, and everybody should feel really proud of the part they played in the evolution of this great invention.

[00:25:48]

Graham-Cumming: It's certainly the case that a tremendous amount of work was done at Sun that was really fundamental and, you know, perhaps some of that was ahead of its time but here we are.

[00:25:57]

Papadopoulos: Thank you.

[00:25:58]

Graham-Cumming: Thank you very much.

[00:25:59]

Papadopoulos: Cheers.


Interested in hearing more? Listen to my conversations with John Gage and Ray Rothrock of Sun Microsystems:

To learn more about Cloudflare Workers, check out the use cases below:

  • Optimizely - Optimizely chose Workers when updating their experimentation platform to provide faster responses from the edge and support more experiments for their customers.
  • Cordial - Cordial used a “stable of Workers” to do custom Black Friday load shedding as well as using it as a serverless platform for building scalable customer-facing services.
  • AO.com - AO.com used Workers to avoid significant code changes to their underlying platform when migrating from a legacy provider to a modern cloud backend.
  • Pwned Passwords - Troy Hunt’s popular "Have I Been Pwned" project benefits from cache hit ratios of 94% on its Pwned Passwords API due to Workers.
  • Timely - Using Workers and Workers KV, Timely was able to safely migrate application endpoints using simple value updates to a distributed key-value store.
  • Quintype - Quintype was an eager adopter of Workers to cache content they previously considered un-cacheable and improve the user experience of their publishing platform.

07:00

The Network is the Computer [The Cloudflare Blog]




We recently registered the trademark for The Network is the Computer®, to encompass how Cloudflare is utilizing its network to pave the way for the future of the Internet.

The phrase was first coined in 1984 by John Gage, the 21st employee of Sun Microsystems, where he was credited with building Sun’s vision around “The Network is the Computer.” When Sun was acquired in 2010, the trademark was not renewed, but the vision remained.

Take it from him:

“When we built Sun Microsystems, every computer we made had the network at its core. But we could only imagine, over thirty years ago, today’s billions of networked devices, from the smallest camera or light bulb to the largest supercomputer, sharing their packets across Cloudflare’s distributed global network.
We based our vision of an interconnected world on open and shared standards. Cloudflare extends this dedication to new levels by openly sharing designs for security and resilience in the post-quantum computer world.
Most importantly, Cloudflare is committed to immediate, open, transparent accountability for network performance. I’m a dedicated reader of their technical blog, as the network becomes central to our security infrastructure and the global economy, demanding even more powerful technical innovation.”

Cloudflare's massive network, which spans more than 180 cities in 80 countries, enables the company to deliver its suite of security, performance, and reliability products, including its serverless edge computing offerings.

In March of 2018, we launched our serverless solution Cloudflare Workers, to allow anyone to deploy code at the edge of our network. We also recently announced advancements to Cloudflare Workers in June of 2019 to give application developers the ability to do away with cloud regions, VMs, servers, containers, load balancers—all they need to do is write the code, and we do the rest. With each of Cloudflare’s data centers acting as a highly scalable application origin to which users are automatically routed via our Anycast network, code is run within milliseconds of users worldwide.

In honor of registering Sun’s former trademark, I spoke with John Gage, Greg Papadopoulos, former CTO of Sun Microsystems, and Ray Rothrock, former Director of CAD/CAM Marketing at Sun Microsystems, to learn more about the history of the phrase and what it means for the future:

To learn more about Cloudflare Workers, check out the use cases below:

  • Optimizely - Optimizely chose Workers when updating their experimentation platform to provide faster responses from the edge and support more experiments for their customers.
  • Cordial - Cordial used a “stable of Workers” to do custom Black Friday load shedding as well as using it as a serverless platform for building scalable customer-facing services.
  • AO.com - AO.com used Workers to avoid significant code changes to their underlying platform when migrating from a legacy provider to a modern cloud backend.
  • Pwned Passwords - Troy Hunt’s popular "Have I Been Pwned" project benefits from cache hit ratios of 94% on its Pwned Passwords API due to Workers.
  • Timely - Using Workers and Workers KV, Timely was able to safely migrate application endpoints using simple value updates to a distributed key-value store.
  • Quintype - Quintype was an eager adopter of Workers to cache content they previously considered un-cacheable and improve the user experience of their publishing platform.

Wednesday, 10 July

08:50

Fedora job opening: Fedora Community Action and Impact Coordinator (FCAIC) [Fedora Magazine]

I’ve decided to move on from my role as the Fedora Community Action and Impact Coordinator (FCAIC).  This was not an easy decision to make. I am proud of the work I have done in Fedora over the last three years and I think I have helped the community move past many challenges.  I could NEVER have done all of this without the support and assistance of the community!

As some of you know, I have been covering some other roles in Red Hat for almost the last year. Some of those tasks have led to opportunities to take my career in a different direction. I am going to remain at Red Hat and on the same team with the same manager, but with a slightly expanded scope of duties. I will no longer be day-to-day on Fedora and will instead be in a consultative role as a Community Architect at Large. This is a fancy way of saying that I will be helping lots of projects with various issues while also working on some specific strategic objectives.

I think this is a great opportunity for the Fedora community. The Fedora I became FCAIC in three years ago is a very different place from the Fedora of today. While I could easily continue to help shape and grow this community, I think that I can do more by letting some new ideas come in. The new person will hopefully be able to approach challenges differently. I'll also be here to offer my advice and feedback, as others who have moved on in the past have done. Additionally, I will work with Matthew Miller and Red Hat to help hire and onboard the new Fedora Community Action and Impact Coordinator. During this time I will continue as FCAIC.

This means that we are looking for a new FCAIC. Love Fedora? Want to work with Fedora full-time to help support and grow the Fedora community? This is the core of what the FCAIC does. The job description (also below) has a list of some of the primary job responsibilities and required skills, but that's just a sample of the duties and of the day-to-day life of working full-time with the Fedora community.

Day to day work includes working with Mindshare, managing the Fedora Budget, and being part of many other teams, including the Fedora Council.  You should be ready to write frequently about Fedora’s achievements, policies and decisions, and to draft and generate ideas and strategies. And, of course, planning Flock and Fedora’s presence at other events. It’s hard work, but also a great deal of fun.

Are you good at setting long-term priorities and hacking away at problems with the big picture in mind? Do you enjoy working with people all around the world, with a variety of skills and interests, to build not just a successful Linux distribution, but a healthy project? Can you set priorities, follow through, and know when to say “no” in order to focus on the most important tasks for success? Is Fedora’s mission deeply important to you?

If you said “yes” to those questions, you might be a great candidate for the FCAIC role. If you think you’re a great fit apply online, or contact Matthew Miller, Brian Exelbierd, or Stormy Peters.



Fedora Community Action and Impact Coordinator

Location: CZ-Remote – prefer Europe but can be North America

Company Description

At Red Hat, we connect an innovative community of customers, partners, and contributors to deliver an open source stack of trusted, high-performing solutions. We offer cloud, Linux, middleware, storage, and virtualization technologies, together with award-winning global customer support, consulting, and implementation services. Red Hat is a rapidly growing company supporting more than 90% of Fortune 500 companies.

Job summary

Red Hat’s Open Source Programs Office (OSPO) team is looking for the next Fedora Community Action and Impact Lead. In this role, you will join the Fedora Council and guide initiatives to grow the Fedora user and developer communities, as well as make Red Hat and Fedora interactions even more transparent and positive. The Council is responsible for stewardship of the Fedora Project as a whole, and supports the health and growth of the Fedora community.

As the Fedora Community Action and Impact Lead, you’ll facilitate decision making on how to best focus the Fedora community budget to meet our collective objectives, work with other council members to identify the short, medium, and long-term goals of the Fedora community, and organize and enable the project.

You will also help make decisions about trademark use, project structure, community disputes or complaints, and other issues. You’ll hold a full council membership, not an auxiliary or advisory role.

Primary job responsibilities

  • Identify opportunities to engage new contributors and community members; align project around supporting those opportunities.
  • Improve on-boarding materials and processes for new contributors.
  • Participate in user and developer discussions and identify barriers to success for contributors and users.
  • Use metrics to evaluate the success of open source initiatives.
  • Regularly report on community metrics and developments, both internally and externally.  
  • Represent Red Hat’s stake in the Fedora community’s success.
  • Work with internal stakeholders to understand their goals and develop strategies for working effectively with the community.
  • Improve onboarding materials and presentation of Fedora to new hires; develop standardized materials on Fedora that can be used globally at Red Hat.
  • Work with the Fedora Council to determine the annual Fedora budget.
  • Assist in planning and organizing Fedora’s flagship events each year.
  • Create and carry out community promotion strategies; create media content like blog posts, podcasts, and videos, and facilitate the creation of media by other members of the community.

Required skills

  • Extensive experience with the Fedora Project or a comparable open source community.
  • Exceptional writing and speaking skills.
  • Experience with software development and open source developer communities; understanding of development processes.
  • Outstanding organizational skills; ability to prioritize tasks matching short and long-term goals and focus on the tasks of high priority.
  • Ability to manage a project budget.
  • Ability to lead teams and participate in multiple cross-organizational teams that span the globe.
  • Experience motivating volunteers and staff across departments and companies.

Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.

Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.


Photo by Deva Williamson on Unsplash.

07:33

Saturday Morning Breakfast Cereal - Odie [Saturday Morning Breakfast Cereal]




Hovertext:
I've never quite got the caption longer than the picture, but if I keep pushing it I might eventually write a novel by accident.



07:07

A gentle introduction to Linux Kernel fuzzing [The Cloudflare Blog]

A gentle introduction to Linux Kernel fuzzing

For some time I’ve wanted to play with coverage-guided fuzzing. Fuzzing is a powerful testing technique where an automated program feeds semi-random inputs to a tested program. The intention is to find inputs that trigger bugs. Fuzzing is especially useful in finding memory corruption bugs in C or C++ programs.

[Header image by Patrick Shannon, CC BY 2.0]

Normally it's recommended to pick a well-known, but little explored, library that is heavy on parsing. Historically things like libjpeg, libpng and libyaml were perfect targets. Nowadays it's harder to find a good target - everything seems to have been fuzzed to death already. That's a good thing! I guess the software is getting better! Instead of choosing a userspace target I decided to have a go at the Linux Kernel netlink machinery.

Netlink is an internal Linux facility used by tools like "ss", "ip" and "netstat". It's used for low level networking tasks - configuring network interfaces, IP addresses, routing tables and such. It's a good target: it's an obscure part of the kernel, and it's relatively easy to automatically craft valid messages. Most importantly, we can learn a lot about Linux internals in the process. Bugs in netlink aren't going to have security impact though - netlink sockets usually require privileged access anyway.

In this post we'll run the AFL fuzzer, driving our netlink shim program against a custom Linux kernel, all of it running inside KVM virtualization.

This blog post is a tutorial. With the easy-to-follow instructions, you should be able to quickly replicate the results. All you need is a machine running Linux and 20 minutes.

Prior work

The technique we are going to use is formally called "coverage-guided fuzzing". There's a lot of prior literature.

Many people have fuzzed the Linux Kernel in the past. Most importantly:

  • syzkaller (aka syzbot) by Dmitry Vyukov, is a very powerful CI-style continuously running kernel fuzzer, which found hundreds of issues already. It's an awesome machine - it will even report the bugs automatically!
  • Trinity fuzzer

We'll use AFL, everyone's favorite fuzzer. AFL was written by Michał Zalewski. It's well known for its ease of use, speed and very good mutation logic. It's a perfect choice for people starting their journey into fuzzing!

If you want to read more about AFL, the documentation is in a couple of files.

Coverage-guided fuzzing

Coverage-guided fuzzing works on the principle of a feedback loop:

  • the fuzzer picks the most promising test case
  • the fuzzer mutates the test into a large number of new test cases
  • the target code runs the mutated test cases, and reports back code coverage
  • the fuzzer computes a score from the reported coverage, and uses it to prioritize the interesting mutated tests and remove the redundant ones

For example, let's say the input test is "hello". The fuzzer may mutate it into a number of new tests, for example: "hEllo" (bit flip), "hXello" (byte insertion), "hllo" (byte deletion). If any of these tests yields interesting code coverage, it will be prioritized and used as a base for the next generation of tests.
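
To make those three mutations concrete, here is a toy illustration in C. This is purely illustrative; AFL's real mutation engine is far more elaborate:

#include <stdio.h>
#include <string.h>

int main(void) {
    char test[16] = "hello";

    test[1] ^= 0x20;                 /* bit flip: "hello" -> "hEllo" */
    printf("%s\n", test);

    strcpy(test, "hello");
    memmove(&test[2], &test[1], 5);  /* shift the tail right to make room... */
    test[1] = 'X';                   /* ...then insert a byte: "hXello" */
    printf("%s\n", test);

    strcpy(test, "hello");
    memmove(&test[1], &test[2], 4);  /* byte deletion: "hllo" */
    printf("%s\n", test);
    return 0;
}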

Specifics of how mutations are done, and how to efficiently compare code coverage reports of thousands of program runs, are the fuzzer's secret sauce. Read the AFL technical whitepaper for the nitty-gritty details.

The code coverage reported back from the binary is very important. It allows the fuzzer to order the test cases and identify the most promising ones. Without code coverage the fuzzer is blind.

Normally, when using AFL, we are required to instrument the target code so that coverage is reported in an AFL-compatible way. But we want to fuzz the kernel! We can't just recompile it with "afl-gcc"! Instead we'll use a trick. We'll prepare a binary that will trick AFL into thinking it was compiled with its tooling. This binary will report back the code coverage extracted from the kernel.

Kernel code coverage

The kernel has at least two built-in coverage mechanisms - GCOV and KCOV.

KCOV was designed with fuzzing in mind, so we'll use this.

Using KCOV is pretty easy. We must compile the Linux kernel with the right setting. First, enable the KCOV kernel config option:

cd linux
./scripts/config \
    -e KCOV \
    -d KCOV_INSTRUMENT_ALL

KCOV is capable of recording code coverage for the whole kernel. That is what the KCOV_INSTRUMENT_ALL option does. This has disadvantages though - it would slow down the parts of the kernel we don't want to profile, and would introduce noise in our measurements (reducing "stability"). For starters, let's disable KCOV_INSTRUMENT_ALL and enable KCOV selectively on the code we actually want to profile. Today we focus on the netlink machinery, so let's enable KCOV on the whole "net" directory tree:

find net -name Makefile | xargs -L1 -I {} bash -c 'echo "KCOV_INSTRUMENT := y" >> {}'

In a perfect world we would enable KCOV only for the couple of files we are really interested in. But netlink handling is peppered all over the network stack code, and we don't have time for fine-tuning it today.

With KCOV in place, it's worth adding "kernel hacking" toggles that will increase the likelihood of reporting memory corruption bugs. See the README for the list of Syzkaller-suggested options - most importantly KASAN.
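
For example, KASAN can be switched on with the same scripts/config helper we used above. A minimal sketch; the full recommended list lives in the Syzkaller docs:

./scripts/config \
    -e KASAN \
    -e KASAN_INLINE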

With that set we can compile our KCOV- and KASAN-enabled kernel. Oh, one more thing. We are going to run the kernel under KVM. We're going to use "virtme", so we need a couple of toggles:

./scripts/config \
    -e VIRTIO -e VIRTIO_PCI -e NET_9P -e NET_9P_VIRTIO -e 9P_FS \
    -e VIRTIO_NET -e VIRTIO_CONSOLE  -e DEVTMPFS ...

(see the README for full list)

How to use KCOV

KCOV is super easy to use. First, note that code coverage is recorded in a per-process data structure. This means you have to enable and disable KCOV from within a userspace process, and it's impossible to record coverage for non-task things, like interrupt handling. This is totally fine for our needs.

KCOV reports data into a ring buffer. Setting it up is pretty simple, see our code. A minimal sketch of that setup, following the kernel's KCOV documentation (error handling omitted; the buffer size is an arbitrary choice):
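
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
#define KCOV_ENABLE     _IO('c', 100)
#define KCOV_DISABLE    _IO('c', 101)
#define KCOV_TRACE_PC   0
#define COVER_SIZE      (64 << 10)   /* number of 64-bit entries */

/* Open the KCOV device, size the trace buffer, and map it. */
int kcov_fd = open("/sys/kernel/debug/kcov", O_RDWR);
ioctl(kcov_fd, KCOV_INIT_TRACE, COVER_SIZE);
unsigned long *kcov_ring =
    mmap(NULL, COVER_SIZE * sizeof(unsigned long),
         PROT_READ | PROT_WRITE, MAP_SHARED, kcov_fd, 0);

Then you can enable and disable it with a trivial ioctl: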

ioctl(kcov_fd, KCOV_ENABLE, KCOV_TRACE_PC);
/* profiled code */
ioctl(kcov_fd, KCOV_DISABLE, 0);

After this sequence the ring buffer contains the list of %rip values of all the basic blocks of the KCOV-enabled kernel code. To read the buffer just run:

n = __atomic_load_n(&kcov_ring[0], __ATOMIC_RELAXED);
for (i = 0; i < n; i++) {
    printf("0x%lx\n", kcov_ring[i + 1]);
}

With tools like addr2line it's possible to resolve the %rip to a specific line of code. We won't need it though - the raw %rip values are sufficient for us.
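
If you do want line numbers, and assuming your kernel build produced a vmlinux with debug info, it's a one-liner (the address below is just a made-up example):

addr2line -e vmlinux 0xffffffff81631a00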

Feeding KCOV into AFL

The next step in our journey is to learn how to trick AFL. Remember, AFL needs a specially-crafted executable, but we want to feed in the kernel code coverage. First we need to understand how AFL works.

AFL sets up an array of 64K 8-bit numbers. This memory region is called "shared_mem" or "trace_bits" and is shared with the traced program. Every byte in the array can be thought of as a hit counter for a particular (branch_src, branch_dst) pair in the instrumented code.

It's important to notice that AFL prefers random branch labels, rather than reusing the %rip value to identify the basic blocks. This is to increase entropy - we want our hit counters in the array to be uniformly distributed. The algorithm AFL uses is:

cur_location = <COMPILE_TIME_RANDOM>;
shared_mem[cur_location ^ prev_location]++; 
prev_location = cur_location >> 1;

In our case with KCOV we don't have compile-time random values for each branch. Instead, we'll use a hash function to generate a uniform 16-bit number from the %rip recorded by KCOV. This is how to feed a KCOV report into the AFL "shared_mem" array:

n = __atomic_load_n(&kcov_ring[0], __ATOMIC_RELAXED);
uint16_t prev_location = 0;
for (i = 0; i < n; i++) {
        uint16_t cur_location = hash_function(kcov_ring[i + 1]);
        shared_mem[cur_location ^ prev_location]++;
        prev_location = cur_location >> 1;
}
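
The hash_function above is left abstract. Any mixing function that spreads the %rip values uniformly over 16 bits will do. One possible choice (an illustration, not necessarily what our shim uses) is Fibonacci hashing:

#include <stdint.h>

/* Multiply by a 64-bit constant derived from the golden ratio and keep
   the top 16 bits; nearby %rip values end up spread across the range. */
static uint16_t hash_function(uint64_t rip) {
    return (uint16_t)((rip * 0x9E3779B97F4A7C15ULL) >> 48);
}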

Reading test data from AFL

Finally, we need to actually write the test code hammering the kernel netlink interface! First we need to read input data from AFL. By default AFL sends a test case to stdin:

/* read AFL test data */
char buf[512*1024];
int buf_len = read(0, buf, sizeof(buf));

Then we need to send this buffer into a netlink socket. But we know nothing about how netlink works! Okay, let's use the first 5 bytes of input as the netlink protocol and group id fields. This will allow AFL to figure out and guess the correct values of these fields. The code testing netlink (simplified):

netlink_fd = socket(AF_NETLINK, SOCK_RAW | SOCK_NONBLOCK, buf[0]);

struct sockaddr_nl sa = {
        .nl_family = AF_NETLINK,
        .nl_groups = (buf[1] <<24) | (buf[2]<<16) | (buf[3]<<8) | buf[4],
};

bind(netlink_fd, (struct sockaddr *) &sa, sizeof(sa));

struct iovec iov = { &buf[5], buf_len - 5 };
struct sockaddr_nl sax = {
      .nl_family = AF_NETLINK,
};

struct msghdr msg = { &sax, sizeof(sax), &iov, 1, NULL, 0, 0 };
r = sendmsg(netlink_fd, &msg, 0);
if (r != -1) {
      /* sendmsg succeeded! great I guess... */
}

That's basically it! For speed, we will wrap this in a short loop that mimics the AFL "fork server" logic. I'll skip the explanation here, see our code for details. The resulting AFL-to-KCOV shim looks like this:

forksrv_welcome();
while(1) {
    forksrv_cycle();
    test_data = afl_read_input();
    kcov_enable();
    /* netlink magic */
    kcov_disable();
    /* fill in shared_map with tuples recorded by kcov */
    if (new_crash_in_dmesg) {
         forksrv_status(1);
    } else {
         forksrv_status(0);
    }
}

See full source code.
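
For reference, the forksrv_* helpers above wrap AFL's fork-server handshake: a 4-byte hello at startup, then a 4-byte read and two 4-byte writes per test case, over the pipes afl-fuzz installs on file descriptors 198 and 199. A rough sketch of what they could look like, based on AFL's documented protocol (the real shim is in the linked source):

#include <signal.h>
#include <stdint.h>
#include <unistd.h>

#define FORKSRV_FD 198  /* control pipe; FORKSRV_FD + 1 is the status pipe */

static void forksrv_welcome(void) {
    uint32_t hello = 0;
    write(FORKSRV_FD + 1, &hello, 4);   /* tell afl-fuzz the fork server is up */
}

static void forksrv_cycle(void) {
    uint32_t ignored, pid = getpid();
    read(FORKSRV_FD, &ignored, 4);      /* block until afl-fuzz requests a run */
    write(FORKSRV_FD + 1, &pid, 4);     /* report the "child" pid */
}

static void forksrv_status(int crashed) {
    /* afl-fuzz expects a wait()-style status; a raw signal number
       reads as "killed by signal", i.e. a crash. */
    uint32_t status = crashed ? SIGSEGV : 0;
    write(FORKSRV_FD + 1, &status, 4);
}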

How to run the custom kernel

We're missing one important piece - how to actually run the custom kernel we've built. There are three options:

"native": You can totally boot the built kernel on your server and fuzz it natively. This is the fastest technique, but pretty problematic. If the fuzzing succeeds in finding a bug you will crash the machine, potentially losing the test data. Cutting the branches we sit on should be avoided.

"uml": We could configure the kernel to run as User Mode Linux. Running a UML kernel requires no privileges. The kernel just runs a user space process. UML is pretty cool, but sadly, it doesn't support KASAN, therefore the chances of finding a memory corruption bug are reduced. Finally, UML is a pretty magic special environment - bugs found in UML may not be relevant on real environments. Interestingly, UML is used by Android network_tests framework.

"kvm": we can use kvm to run our custom kernel in a virtualized environment. This is what we'll do.

One of the simplest ways to run a custom kernel in a KVM environment is to use "virtme" scripts. With them we can avoid having to create a dedicated disk image or partition, and just share the host file system. This is how we can run our code:

virtme-run \
    --kimg bzImage \
    --rw --pwd --memory 512M \
    --script-sh "<what to run inside kvm>" 

But hold on. We forgot about preparing input corpus data for our fuzzer!

Building the input corpus

Every fuzzer takes carefully crafted test cases as input, to bootstrap the first mutations. The test cases should be short, and cover as large a part of the code as possible. Sadly - I know nothing about netlink. How about we don't prepare the input corpus...

Instead we can ask AFL to "figure out" what inputs make sense. This is what Michał did back in 2014 with JPEGs and it worked for him. With this in mind, here is our input corpus:

mkdir inp
echo "hello world" > inp/01.txt

Instructions on how to compile and run the whole thing are in the README.md on our GitHub. It boils down to:

virtme-run \
    --kimg bzImage \
    --rw --pwd --memory 512M \
    --script-sh "./afl-fuzz -i inp -o out -- fuzznetlink" 

With this running you will see the familiar AFL status screen:

(screenshot: the AFL status screen)

Further notes

That's it. Now you have a custom hardened kernel, running a basic coverage-guided fuzzer. All inside KVM.

Was it worth the effort? Even with this basic fuzzer, and no input corpus, after a day or two the fuzzer found an interesting code path: NEIGH: BUG, double timer add, state is 8. With a more specialized fuzzer, some work on improving the "stability" metric and a decent input corpus, we could expect even better results.

If you want to learn more about what netlink sockets actually do, see the blog post by my colleague Jakub Sitnicki, Multipath Routing in Linux - part 1. There is also a good chapter about it in the Linux Kernel Networking book by Rami Rosen.

In this blog post we haven't mentioned:

  • details of AFL shared_memory setup
  • implementation of AFL persistent mode
  • how to create a network namespace to isolate the effects of weird netlink commands, and improve the "stability" AFL score
  • technique on how to read dmesg (/dev/kmsg) to find kernel crashes
  • idea to run AFL outside of KVM, for speed and stability - currently the tests aren't stable after a crash is found

But we achieved our goal - we set up a basic, yet still useful, fuzzer against a kernel. Most importantly: the same machinery can be reused to fuzz other Linux subsystems - from file systems to the bpf verifier.

I also learned a hard lesson: tuning fuzzers is a full-time job. Proper fuzzing is definitely not as simple as starting it up and idly waiting for crashes. There is always something to improve, tune, and re-implement. A quote at the beginning of the mentioned presentation by Mateusz Jurczyk resonated with me:

"Fuzzing is easy to learn but hard to master."

Happy bug hunting!

Tuesday, 09 July

06:51

Red Hat, IBM, and Fedora [Fedora Magazine]

Today marks a new day in the 26-year history of Red Hat. IBM has finalized its acquisition of Red Hat, which will operate as a distinct unit within IBM.

What does this mean for Red Hat’s participation in the Fedora Project?

In short, nothing.

Red Hat will continue to be a champion for open source, just as it always has, and valued projects like Fedora will continue to play a role in driving innovation in open source technology. IBM is committed to Red Hat’s independence and role in open source software communities. We will continue this work and, as always, we will continue to help upstream projects be successful and contribute to welcoming new members and maintaining the project.

In Fedora, our mission, governance, and objectives remain the same. Red Hat associates will continue to contribute to the upstream in the same ways they have been.

We will do this together, with the community, as we always have.

If you have questions or would like to learn more about today’s news, I encourage you to review the materials below. For any questions not answered here, please feel free to contact us. Red Hat CTO Chris Wright will host an online Q&A session in the coming days where you can ask questions you may have about what the acquisition means for Red Hat and our involvement in open source communities. Details will be announced on the Red Hat blog.

Regards,

Matthew Miller, Fedora Project Leader
Brian Exelbierd, Fedora Community Action and Impact Coordinator

04:49

Saturday Morning Breakfast Cereal - Psycho [Saturday Morning Breakfast Cereal]



Hovertext:
That third-to-last panel is going to be the title of my autobiography one day.

Monday, 08 July

05:42

Saturday Morning Breakfast Cereal - Breakup [Saturday Morning Breakfast Cereal]



Hovertext:
Now that America has won the World Cup, it's officially called Soccer until someone beats us.

02:00

Command line quick tips: Permissions [Fedora Magazine]

Fedora, like all Linux based systems, comes with a powerful set of security features. One of the basic features is permissions on files and folders. These permissions allow files and folders to be secured from unauthorized access. This article explains a bit about these permissions, and shows you how to share access to a folder using them.

Permission basics

Fedora is by nature a multi-user operating system. It also has groups, which users can be members of. But imagine for a moment a multi-user system with no concept of permissions. Different logged in users could read each other’s content at will. This isn’t very good for privacy or security, as you can imagine.

Any file or folder on Fedora has three sets of permissions assigned. The first set is for the user who owns the file or folder. The second is for the group that owns it. The third set is for everyone else: anyone who is neither the owning user nor in the owning group. Sometimes this set is called the world.

What permissions mean

Each set of permissions comes in three flavors — read, write, and execute. Each of these has an initial that stands for the permission, thus r, w, and x.

File permissions

For files, here’s what these permissions mean:

  • Read (r): the file content can be read
  • Write (w): the file content can be changed
  • Execute (x): the file can be executed — this is used primarily for programs or scripts that are meant to be run directly

You can see the three sets of these permissions when you do a long listing of any file. Try this with the /etc/services file on your system:

$ ls -l /etc/services
-rw-r--r--. 1 root root 692241 Apr 9 03:47 /etc/services

Notice the groups of permissions at the left side of the listing. These are provided in three sets, as mentioned above — for the user who owns the file, for the group that owns the file, and for everyone else. The user owner is root and the group owner is the root group. The user owner has read and write access to the file. Anyone in the group root can only read the file. And finally, anyone else can also only read the file. (The dash at the far left shows this is a regular file.)
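
Broken down visually, that permission string reads:

-   rw-   r--   r--
|   |     |     |
|   |     |     +-- everyone else: read only
|   |     +-------- group owner (root): read only
|   +-------------- user owner (root): read and write
+------------------ file type (- means a regular file)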

By the way, you’ll commonly find this set of permissions on many (but not all) system configuration files. They are only meant to be changed by the system administrator, not regular users. Often regular users need to read the content as well.

Folder (directory) permissions

For folders, the permissions have slightly different meaning:

  • Read (r): the folder contents can be read (such as the ls command)
  • Write (w): the folder contents can be changed (files can be created or erased in this folder)
  • Execute (x): the folder can be searched, although its contents cannot be read. (This may sound strange, but the explanation requires more complex details of file systems outside the scope of this article. So just roll with it for now.)

Take a look at the /etc/grub.d folder for example:

$ ls -ld /etc/grub.d
drwx------. 2 root root 4096 May 23 16:28 /etc/grub.d

Note the d at the far left. It shows this is a directory, or folder. The permissions show the user owner (root) can read, change, and cd into this folder. However, no one else can do so — whether they’re a member of the root group or not. Notice you can’t cd into the folder, either:

$ cd /etc/grub.d
bash: cd: /etc/grub.d: Permission denied

Notice how your own home directory is set up:

$ ls -ld $HOME
drwx------. 221 paul paul 28672 Jul 3 14:03 /home/paul

Now, notice how no one, other than you as the owner, can access anything in this folder. This is intentional! You wouldn’t want others to be able to read your private content on a shared system.

Making a shared folder

You can exploit this permissions capability to easily make a folder to share within a group. Imagine you have a group called finance with several members who need to share documents. Because these are user documents, it’s a good idea to store them within the /home folder hierarchy.
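
If the finance group doesn’t exist on your system yet, create it and add its members first (jane is a hypothetical username here):

$ sudo groupadd finance
$ sudo usermod -aG finance jane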

To get started, use sudo to make a folder for sharing, and set it to be owned by the finance group:

$ sudo mkdir -p /home/shared/finance
$ sudo chgrp finance /home/shared/finance

By default the new folder has these permissions. Notice how it can be read or searched by anyone, even if they can’t create or erase files in it:

drwxr-xr-x. 2 root finance 4096 Jul  6 15:35 finance

That doesn’t seem like a good idea for financial data. Next, use the chmod command to change the mode (permissions) of the shared folder. Note the use of g to change the owning group’s permissions, and o to change other users’ permissions. Similarly, u would change the user owner’s permissions:

$ sudo chmod g+w,o-rx /home/shared/finance

The resulting permissions look better. Now, anyone in the finance group (or the user owner root) has total access to the folder and its contents:

drwxrwx---. 2 root finance 4096 Jul  6 15:35 finance

If any other user tries to access the shared folder, they won’t be able to do so. Great! Now our finance group can put documents in a shared place.

Other notes

There are additional ways to manipulate these permissions. For example, you may want any files in this folder to be set as owned by the group finance. This requires additional settings not covered in this article, but stay tuned to the Magazine for more on that topic soon.

Sunday, 07 July

05:23

Saturday Morning Breakfast Cereal - Morale [Saturday Morning Breakfast Cereal]



Hovertext:
Later, the bird loses his job, but gets a bonus of ten million birdseeds.

Saturday, 06 July

04:45

Saturday Morning Breakfast Cereal - Recursion [Saturday Morning Breakfast Cereal]



Hovertext:
If you are experiencing infinite loop problems, please return to the beginning of this sentence.

Friday, 05 July

13:34

Saturday Morning Breakfast Cereal - Die Kudzu [Saturday Morning Breakfast Cereal]



Hovertext:
We will, however, be keeping the invasive cherry trees.

01:00

Manage your shell environment [Fedora Magazine]

Some time ago, Fedora Magazine published an article introducing ZSH — an alternative to Fedora’s default shell, bash. This time, we’re going to look into customizing ZSH to use it more effectively. All of the concepts shown in this article also work in other shells such as bash.

Alias

Aliases are shortcuts for commands. This is useful for creating short commands for actions that are performed often, but require a long command that would take too much time to type. The syntax is:

$ alias yourAlias='complex command with arguments'

They don’t always need to be used for shortening long commands. What’s important is that you use them for tasks you do often. An example could be:

$ alias dnfUpgrade='dnf -y upgrade'

That way, to do a system upgrade, I just type dnfUpgrade instead of the whole dnf command.

The problem with setting aliases right in the console is that once the terminal session is closed, the alias is lost. To set them permanently, resource files are used.

Resource Files

Resource files (or rc files) are configuration files that are loaded per user in the beginning of a session or a process (when a new terminal window is opened, or a new program like vim is started). In the case of ZSH, the resource file is .zshrc, and for bash it’s .bashrc.

To make the aliases permanent, put them in your resource file. You can edit your resource file with a text editor of your choice. This example uses vim:

$ vim $HOME/.zshrc

Or for bash:

$ vim $HOME/.bashrc

Note that the location of the resource file is specified relative to the home directory — and that’s where ZSH (or bash) is going to look for the file by default for each user.

Another option is to put your configuration in any other file, and then source it:

$ source /path/to/your/rc/file

Again, sourcing it right in your session will only apply it to that session, so to make it permanent, add the source command to your resource file. The advantage of keeping your configuration in a separate file is that you can source it at any time and from anywhere, which is especially useful in shared environments.

Environment Variables

Environment variables are values assigned to a specific name which can then be referenced in scripts and commands. They are referenced with a leading dollar sign ($). One of the most common is $HOME, which references the home directory.

As the name suggests, environment variables are a part of your environment. Set a variable using the following syntax:

$ http_proxy="http://your.proxy"

And to make it an environment variable, export it with the following command:

$ export http_proxy
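
You can also set and export a variable in a single step:

$ export http_proxy="http://your.proxy"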

To see all the environment variables that are currently set, use the env command:

$ env

The command outputs all the variables available in your session. To demonstrate how to use them in a command, try running the following echo commands:

$ echo $PWD
/home/fedora
$ echo $USER
fedora

What happens here is variable expansion — the value stored in the variable is used in your command.

Another useful variable is $PATH, which defines the directories your shell uses to look for binaries.

The $PATH variable

There are many directories, or folders (as they are called in graphical environments), that are important to the OS. Some directories are set to hold binaries you can use directly in your shell. And these directories are defined in the $PATH variable.

$ echo $PATH
/usr/lib64/qt-3.3/bin:/usr/share/Modules/bin:/usr/lib64/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/usr/libexec/sdcc:/usr/libexec/sdcc:/usr/bin:/bin:/sbin:/usr/sbin:/opt/FortiClient

This will help you when you want to have your own binaries (or scripts) accessible in the shell.
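
For example, assuming you keep personal scripts in $HOME/bin, appending that directory to $PATH in your resource file makes those scripts callable by name in any new shell session:

export PATH="$PATH:$HOME/bin"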

Thursday, 04 July

05:57

Saturday Morning Breakfast Cereal - Dance [Saturday Morning Breakfast Cereal]



Hovertext:
There needs to be a line of throw pillows that say 'Murder like nobody's looking.'

Wednesday, 03 July

Tuesday, 02 July

09:50

Cloudflare outage caused by bad software deploy (updated) [The Cloudflare Blog]

This is a short placeholder blog and will be replaced with a full post-mortem and disclosure of what happened today.

For about 30 minutes today, visitors to Cloudflare sites received 502 errors caused by a massive spike in CPU utilization on our network. This CPU spike was caused by a bad software deploy that was rolled back. Once rolled back the service returned to normal operation and all domains using Cloudflare returned to normal traffic levels.

This was not an attack (as some have speculated) and we are incredibly sorry that this incident occurred. Internal teams are meeting as I write, performing a full post-mortem to understand how this occurred and how we prevent this from ever occurring again.


Update at 2009 UTC:

Starting at 1342 UTC today we experienced a global outage across our network that resulted in visitors to Cloudflare-proxied domains being shown 502 errors (“Bad Gateway”). The cause of this outage was deployment of a single misconfigured rule within the Cloudflare Web Application Firewall (WAF) during a routine deployment of new Cloudflare WAF Managed rules.

The intent of these new rules was to improve the blocking of inline JavaScript that is used in attacks. These rules were being deployed in a simulated mode where issues are identified and logged by the new rule but no customer traffic is actually blocked so that we can measure false positive rates and ensure that the new rules do not cause problems when they are deployed into full production.

Unfortunately, one of these rules contained a regular expression that caused CPU to spike to 100% on our machines worldwide. This 100% CPU spike caused the 502 errors that our customers saw. At its worst, traffic dropped by 82%.

This chart shows percentage of traffic lost in one of our PoPs:

This was an unprecedented CPU exhaustion event; we had never experienced global CPU exhaustion before.

We make software deployments constantly across the network and have automated systems to run test suites and a procedure for deploying progressively to prevent incidents. Unfortunately, these WAF rules were deployed globally in one go and caused today’s outage.

At 1402 UTC we understood what was happening and decided to issue a ‘global kill’ on the WAF Managed Rulesets, which instantly dropped CPU back to normal and restored traffic. That occurred at 1409 UTC.

We then went on to review the offending pull request, roll back the specific rules, test the change to ensure that we were 100% certain that we had the correct fix, and re-enabled the WAF Managed Rulesets at 1452 UTC.

We recognize that an incident like this is very painful for our customers. Our testing processes were insufficient in this case and we are reviewing and making changes to our testing and deployment process to avoid incidents like this in the future.


05:34

Saturday Morning Breakfast Cereal - Evolution [Saturday Morning Breakfast Cereal]



Hovertext:
As I post this, I realize it's perilously close to Katherine Read's excellent talk from the last BAHFest London. So, go watch it!

02:00

Jupyter and data science in Fedora [Fedora Magazine]

In the past, kings and leaders used oracles and magicians to help them predict the future — or at least get some good advice due to their supposed power to perceive hidden information. Nowadays, we live in a society obsessed with quantifying everything. So we have data scientists to do this job.

Data scientists use statistical models, numerical techniques and advanced algorithms that didn’t come from statistical disciplines, along with the data that exists in databases, to find, infer, and predict data that doesn’t exist yet. Sometimes this data is about the future. That is why we do a lot of predictive analytics and prescriptive analytics.

Here are some questions to which data scientists help find answers:

  1. Which students have a high propensity to abandon the class? For each one, what are the reasons for leaving?
  2. Which house has a price above or below the fair price? What is the fair price for a certain house?
  3. What are the hidden groups that my clients classify themselves into?
  4. Which future problems will this premature child develop?
  5. How many calls will I get in my call center tomorrow at 11:43 AM?
  6. Should my bank lend money to this customer?

Note how the answers to all these questions are not sitting in any database waiting to be queried. This is all data that doesn’t exist yet and has to be calculated. That is part of the job we data scientists do.

Throughout this article you’ll learn how to prepare a Fedora system as a Data Scientist’s development environment and also a production system. Most of the basic software is RPM-packaged, but the most advanced parts can only be installed, nowadays, with Python’s pip tool.

Jupyter — the IDE

Most modern data scientists use Python. And an important part of their work is EDA (exploratory data analysis). EDA is a manual and interactive process that retrieves data, explores its features, searches for correlations, uses plotted graphics to visualize and understand how data is shaped, and prototypes predictive models.

Jupyter is a web application perfect for this task. Jupyter works with Notebooks: documents that mix rich text (including beautifully rendered math formulas, thanks to mathjax) with blocks of code and their output, including graphics.

Notebook files have the extension .ipynb, which stands for Interactive Python Notebook.

Setting up and running Jupyter

First, install essential packages for Jupyter (using sudo):

$ sudo dnf install python3-notebook mathjax sscg

You might want to install additional and optional Python modules commonly used by data scientists:

$ sudo dnf install python3-seaborn python3-lxml python3-basemap python3-scikit-image python3-scikit-learn python3-sympy python3-dask+dataframe python3-nltk

Set a password to log into the Notebook web interface and avoid those long tokens. Run the following commands anywhere in your terminal:

$ mkdir -p $HOME/.jupyter
$ jupyter notebook password

Now, type a password for yourself. This will create the file $HOME/.jupyter/jupyter_notebook_config.json with a hashed version of your password.

Next, prepare for SSL by generating a self-signed HTTPS certificate for Jupyter’s web server:

$ cd $HOME/.jupyter; sscg

Finish configuring Jupyter by editing your $HOME/.jupyter/jupyter_notebook_config.json file. Make it look like this:

{
  "NotebookApp": {
    "password": "sha1:abf58...87b",
    "ip": "*",
    "allow_origin": "*",
    "allow_remote_access": true,
    "open_browser": false,
    "websocket_compression_options": {},
    "certfile": "/home/aviram/.jupyter/service.pem",
    "keyfile": "/home/aviram/.jupyter/service-key.pem",
    "notebook_dir": "/home/aviram/Notebooks"
  }
}

The certfile, keyfile, and notebook_dir values must be changed to match your own folders and file names. The password entry was already there after you created your password, and the certfile and keyfile entries point to the files generated by sscg.

Create a folder for your notebook files, as configured in the notebook_dir setting above:

$ mkdir $HOME/Notebooks

Now you are all set. Just run Jupyter Notebook from anywhere on your system by typing:

$ jupyter notebook

Or add this line to your $HOME/.bashrc file to create a shortcut command called jn:

alias jn='jupyter notebook'

After running the command jn, access https://your-fedora-host.com:8888 from any browser on the network to see the Jupyter user interface. You’ll need to use the password you set up earlier. Start typing some Python code and markup text. This is how it looks:

Jupyter with a simple notebook

In addition to the IPython environment, you’ll also get a web-based Unix terminal provided by terminado. Some people might find this useful, while others find this insecure. You can disable this feature in the config file.
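
For example, assuming your notebook version supports the NotebookApp option terminals_enabled, adding this line to the "NotebookApp" section of the config file shown earlier turns the web terminal off:

"terminals_enabled": false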

JupyterLab — the next generation of Jupyter

JupyterLab is the next generation of Jupyter, with a better interface and more control over your workspace. It’s not RPM-packaged for Fedora at the time of writing, but you can use pip to install it easily:

$ pip3 install jupyterlab --user
$ jupyter serverextension enable --py jupyterlab

Then run your regular jupyter notebook command or the jn alias. JupyterLab will be accessible from https://your-fedora-host.com:8888/lab.

Tools used by data scientists

In this section you can get to know some of these tools, and how to install them. Unless noted otherwise, the module is already packaged for Fedora and was installed as a prerequisite for the previous components.

Numpy

Numpy is an advanced and C-optimized math library designed to work with large in-memory datasets. It provides advanced multidimensional matrix support and operations, including math functions such as log(), exp(), and the trigonometric functions.

Pandas

In this author’s opinion, Python is THE platform for data science mostly because of Pandas. Built on top of numpy, Pandas makes the work of preparing and displaying data easy. You can think of it as a no-UI spreadsheet, but ready to work with much larger datasets. Pandas helps with data retrieval from a SQL database, CSV or other types of files, column and row manipulation, data filtering and, to some extent, data visualization with matplotlib.

Matplotlib

Matplotlib is a library to plot 2D and 3D data. It has great support for notations in graphics, labels, and overlays.

matplotlib pair of graphics showing a cost function searching its optimal value through a gradient descent algorithm

Seaborn

Built on top of matplotlib, Seaborn’s graphics are optimized for a more statistical comprehension of data. It automatically displays regression lines or Gauss curve approximations of plotted data.

Linear regression visualised with SeaBorn

StatsModels

StatsModels provides algorithms for statistical and econometrics data analysis such as linear and logistic regressions. StatsModels is also home to the classical family of time series algorithms known as ARIMA.

Normalized number of passengers across time (blue) and ARIMA-predicted number of passengers (red)

Scikit-learn

The central piece of the machine-learning ecosystem, scikit provides predictor algorithms for regression (ElasticNet, Gradient Boosting, Random Forest etc), classification, and clustering (K-means, DBSCAN etc). It features a very well designed API. Scikit also has classes for advanced data manipulation, splitting datasets into train and test parts, dimensionality reduction, and data pipeline preparation.

XGBoost

XGBoost is one of the most advanced regressors and classifiers used nowadays. It’s not part of scikit-learn, but it adheres to scikit’s API. XGBoost is not packaged for Fedora and should be installed with pip. XGBoost can be accelerated with your nVidia GPU, but not through its pip package; for that you have to compile it yourself against CUDA. Get it with:

$ pip3 install xgboost --user

Imbalanced Learn

imbalanced-learn provides ways for under-sampling and over-sampling data. It is useful in fraud detection scenarios where known fraud data is very small when compared to non-fraud data. In these cases data augmentation is needed for the known fraud data, to make it more relevant to train predictors. Install it with pip:

$ pip3 install imblearn --user

NLTK

The Natural Language toolkit, or NLTK, helps you work with human language data for the purpose of building chatbots (just to cite an example).

SHAP

Machine learning algorithms are very good at predicting, but aren’t good at explaining why they made a prediction. SHAP solves that, by analyzing trained models.

Where SHAP fits into the data analysis process

Install it with pip:

$ pip3 install shap --user

Keras

Keras is a library for deep learning and neural networks. Install it with pip:

$ sudo dnf install python3-h5py
$ pip3 install keras --user

TensorFlow

TensorFlow is a popular neural networks builder. Install it with pip:

$ pip3 install tensorflow --user

Photo courtesy of FolsomNatural on Flickr (CC BY-SA 2.0).

Monday, 01 July

Sunday, 30 June

Saturday, 29 June

Friday, 28 June

16:22

Third Time’s a Charm (A brief history of a gay marriage) [The Cloudflare Blog]


Happy Pride from Proudflare, Cloudflare’s LGBTQIA+ employee resource group. We wanted to share some stories from our members this month which highlight both the struggles behind the LGBTQIA+ rights movement and its successes. This first story is from Lesley.

The moment that crystalised the memory of that day…crystal blue afternoon, bright-coloured autumn leaves, borrowed tables, crockery and cutlery, flowers arranged by a cousin, cake baked by a neighbour, music mixed by a friend... our priest/rabbi a close gay friend with neither  yarmulke nor collar. The venue, a backyard kitty-corner at the home my wife grew up in. Love and good wishes in abundance from a community that supports us and our union. And in the middle of all that, my wife… turning to me and smiling, grass stains on the bottom of her long cream wedding dress after abandoning her heels and dancing barefoot in the grass. As usual, a microphone in hand, bringing life and laughter to all with her charismatic quips.

This was the fall of 2002 and same-sex marriage was legal in 0 of the 50 United States.

Our first marriage in Oct 2002 in Walnut Creek, CA

It was a tough time economically. We had a front row seat to the historic internet boom and bust. My company filed chapter 11 bankruptcy and my customer’s customers were going out of business. I did not anticipate getting a job anytime soon. So after a lot of silver-tongued persuading, I convinced my wife to quit her job, rent out our home, and buy an RV. We grabbed our two Australian shepherds and toured the US for a nine-month honeymoon. Forty-five states and 36,000 miles later, we still had some funds left over and I wanted to show Robin my other home, so we travelled to South Africa for 6 weeks. Just before we got on our flight back to the US, we heard news from our family that Gavin Newsom, the then mayor of San Francisco, was going to declare same-sex marriage legal and begin issuing marriage licenses in San Francisco. The trip back to the Bay Area took 42 hours door-to-door and had a nine-hour time change. We got home, dropped off our bags and the very next morning, completely jet-lagged, went back into the city to stand in line to get our marriage license.

Our second marriage was a much smaller and more intimate affair

Held in an alcove in the beautiful San Francisco City Hall rotunda with its exquisite architectural design and a view of the grand staircase. It was attended by Robin’s parents and a couple of our close friends that could take the time at such short notice. Knowing that the time window was closing, we grabbed the opportunity to have our marriage recognised legally along with dozens of other jubilant gay couples.

Alas, a short time later, we received an annulment in the mail. We received that, along with an apology and a request that we donate the licensing fees to the city. We were disappointed, but felt our love was strong enough to carry us through and who needed a piece of paper anyway, right?!?

Our attitude changed significantly when we had our son. We had been trying for a while and what finally worked for us was to take my egg along with a sperm bank donation, and impregnate Robin. To this day, my mom says I’m the best delegator she knows. I delegated childbirth. I also delegated all my rights as Joey’s mother. Absent a marriage, in California, the birth mother has all the rights and responsibilities.

Robin had been in a relationship before me where she had planned and had a child with another woman. When they split up, Robin had no rights to see or have access to the child. She also had no obligation to support the child in any way, financial or other.

We wanted to make sure Joey never faced that predicament and without the option for marriage, took the next available avenue. I adopted Joey. Even though he is genetically my child, we had to go through a lengthy and costly procedure to adopt him. We had child protective services inspect our home and come for numerous visits to ensure I would be a “suitable” parent for my child. Eventually, I was granted adoption approval and we went to family court in Martinez where I came before a judge and officially adopted Joey as my son.

In the early summer of 2008, the California Supreme Court declared same-sex marriage legal in California. We were the second state to make it legal after Massachusetts. The court found that barring same-sex couples from marriage violated California State’s Constitution.  

Cue marriage number three…

In the period between the declaration and Prop 8 passing, Robin and I joined 18,000 gay couples that tied the knot.

At this point, we were pros at getting married and went with casual jeans and white cotton shirts, which was easier for everyone including our son who attended our wedding. This was the first marriage where our legal rights and responsibilities actually stuck. These rights were only valid in California however, so on any travel outside of our beautiful state our union would be considered illegitimate. From a tax perspective, it was a real adventure with every advisor having a different take on the way we should file our taxes. Federally our marriage was not recognized, but in California it was. This led to a lot of confusion and added expenses every April.

Little did we know the backlash that our happy/gay marriages would cause. The religious, conservative right came back at us with Prop 8 for daring to expect equality.

From our perspective, there was no other way to view this than vengeful and born out of malice for gays. Why would these people care that we wanted to live together and have the protections of marriage? I saw this as a group of people wanting to impose their religion and view of what a marriage should be on us. We were on vacation in Hawaii when the election results were announced, sweet with Barack Obama being elected and so, so very bitter with Prop 8 passing.

Prop 8 provoked a lot of soul-searching for me. I was very angry and had a general distrust of people that I had never felt before. I would be in the supermarket line and wonder who there may or may not have voted against my marriage. It was deeply personal and hurtful. We had Mormon friends, who are for the most part wonderful and whose company we enjoyed. Knowing the extreme measures their community went to to ensure Prop 8 passed, cut me deeply.  Catholics and conservatives who are both family and friends, went out of their way to harm me and my family and to make our lives more difficult because they believed we were sinners and not worthy of equality in the eyes of the law. Fortunately, our marriage was grandfathered in, so our rights in California were preserved.

The one ray of light was watching our allies stand up and come to our defense. In my life, I’ve pretty much always been part of the privileged class. I’m a white woman with a degree who grew up in an affluent home. I had never personally experienced discrimination or felt part of a marginalized minority. To have allies that stepped up and argued on our behalf brought tears to my eyes. We would not have the rights we have today without those allies. This was a significant lesson for me to learn. I will always stand up for the disenfranchised and make my voice heard to defend those who cannot defend themselves as others have done for me and my family. I know how much it means.

Life went on, and the fight went on, gathering momentum as more states legalized same sex marriage initially through court action and then through popular vote.

June 26, 2015 was a triumphant day

We celebrated a landmark victory for gay rights as the U.S. Supreme Court ruled that same-sex marriage was a constitutional right and DOMA (the Defense of Marriage Act) was repealed. Finally, our marriage was recognised in every state in the Union.

We still consider our first wedding as the day we got married. We wrote our own vows and they have traveled with us from home to home framed with pride on the wall in our bedroom.

In October this year, Robin and I will celebrate our 17th wedding anniversary. We’ve been together 19 years in total and it’s been quite a ride. I promised Robin I’d marry her seven times, we still have a way to go!

On vacation, May 2019 in Bora Bora

02:00

Upcoming features in Fedora 31 Workstation [Fedora Magazine]

The Fedora Workstation edition is a fabulous operating system that includes everything a developer needs. But it’s also a perfect solution for anyone who wants to be productive online with their desktop or laptop computer. It features a sleek interface and an enormous catalog of ready-to-install software. Recently, Christian Schaller shared information about what’s coming in the Workstation for Fedora 31.

Fedora 31 is currently scheduled for release in late October 2019. With it, as usual, will come an assortment of new and refreshed free and open source software. This includes the GNOME desktop which is planned to be updated to the latest 3.34.

Under the hood of the desktop, many intrepid open source developers have been toiling away. They’ve been working on things like:

  • The Wayland desktop compositor
  • Working with NVidia to provide better driver support
  • PipeWire, for better audio and video handling
  • Expanded Flatpak support and features
  • A container toolbox
  • …and much more!

Long-time and keen readers of the Magazine probably know that Christian is deeply involved in the Workstation effort. He heads up the desktop engineering groups at Red Hat. But he’s also involved heavily in the community Workstation Working Group, which guides these efforts as well. As an experienced developer himself, he brings his expertise to the open source community every day to build a better desktop.

For all the details, check out Christian’s detailed and informative blog post on Fedora 31 Workstation. And stay tuned to the Magazine for more about the upcoming release in the next few months!

Thursday, 27 June

02:00

RPM packages explained [Fedora Magazine]

Perhaps the best known way the Fedora community pursues its mission of promoting free and open source software and content is by developing the Fedora software distribution. So it’s not a surprise at all that a very large proportion of our community resources are spent on this task. This post summarizes how this software is “packaged” and the underlying tools such as rpm that make it all possible.

RPM: the smallest unit of software

The editions and flavors (spins/labs/silverblue) that users get to choose from are all very similar. They’re all composed of various software that is mixed and matched to work well together. What differs between them is the exact list of tools that goes into each. That choice depends on the use case that they target. The basic unit of all of these is an RPM package file.

RPM files are archives that are similar to ZIP files or tarballs. In fact, they use compression to reduce the size of the archive. However, along with files, RPM archives also contain metadata about the package. This can be queried using the rpm tool:


$ rpm -q fpaste
fpaste-0.3.9.2-2.fc30.noarch

$ rpm -qi fpaste
Name        : fpaste
Version     : 0.3.9.2
Release     : 2.fc30
Architecture: noarch
Install Date: Tue 26 Mar 2019 08:49:10 GMT
Group       : Unspecified
Size        : 64144
License     : GPLv3+
Signature   : RSA/SHA256, Thu 07 Feb 2019 15:46:11 GMT, Key ID ef3c111fcfc659b9
Source RPM  : fpaste-0.3.9.2-2.fc30.src.rpm
Build Date  : Thu 31 Jan 2019 20:06:01 GMT
Build Host  : buildhw-07.phx2.fedoraproject.org
Relocations : (not relocatable)
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : https://pagure.io/fpaste
Bug URL     : https://bugz.fedoraproject.org/fpaste
Summary     : A simple tool for pasting info onto sticky notes instances
Description :
It is often useful to be able to easily paste text to the Fedora
Pastebin at http://paste.fedoraproject.org and this simple script
will do that and return the resulting URL so that people may
examine the output. This can hopefully help folks who are for
some reason stuck without X, working remotely, or any other
reason they may be unable to paste something into the pastebin

$ rpm -ql fpaste
/usr/bin/fpaste
/usr/share/doc/fpaste
/usr/share/doc/fpaste/README.rst
/usr/share/doc/fpaste/TODO
/usr/share/licenses/fpaste
/usr/share/licenses/fpaste/COPYING
/usr/share/man/man1/fpaste.1.gz

When an RPM package is installed, the rpm tools know exactly what files were added to the system. So, removing a package also removes these files, and leaves the system in a consistent state. This is why installing software using rpm is preferred over installing software from source whenever possible.
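
For example, erasing the fpaste package removes exactly the files listed above; afterwards, rpm -q fpaste reports that the package is not installed:

$ sudo rpm -e fpaste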

Dependencies

Nowadays, it is quite rare for software to be completely self-contained. Even fpaste, a simple one-file Python script, requires that the Python interpreter be installed. So, if the system does not have Python installed (highly unlikely, but possible), fpaste cannot be used. In packager jargon, we say that “Python is a run-time dependency of fpaste”.

When RPM packages are built (the process of building RPMs is not discussed in this post), the generated archive includes all of this metadata. That way, the tools interacting with the RPM package archive know what else must be installed so that fpaste works correctly:


$ rpm -q --requires fpaste
/usr/bin/python3
python3
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1

$ rpm -q --provides fpaste
fpaste = 0.3.9.2-2.fc30

$ rpm -qi python3
Name        : python3
Version     : 3.7.3
Release     : 3.fc30
Architecture: x86_64
Install Date: Thu 16 May 2019 18:51:41 BST
Group       : Unspecified
Size        : 46139
License     : Python
Signature   : RSA/SHA256, Sat 11 May 2019 17:02:44 BST, Key ID ef3c111fcfc659b9
Source RPM  : python3-3.7.3-3.fc30.src.rpm
Build Date  : Sat 11 May 2019 01:47:35 BST
Build Host  : buildhw-05.phx2.fedoraproject.org
Relocations : (not relocatable)
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : https://www.python.org/
Bug URL     : https://bugz.fedoraproject.org/python3
Summary     : Interpreter of the Python programming language
Description :
Python is an accessible, high-level, dynamically typed, interpreted programming
language, designed with an emphasis on code readability.
It includes an extensive standard library, and has a vast ecosystem of
third-party libraries.

The python3 package provides the "python3" executable: the reference
interpreter for the Python language, version 3.
The majority of its standard library is provided in the python3-libs package,
which should be installed automatically along with python3.
The remaining parts of the Python standard library are broken out into the
python3-tkinter and python3-test packages, which may need to be installed
separately.

Documentation for Python is provided in the python3-docs package.

Packages containing additional libraries for Python are generally named with
the "python3-" prefix.

$ rpm -q --provides python3
python(abi) = 3.7
python3 = 3.7.3-3.fc30
python3(x86-64) = 3.7.3-3.fc30
python3.7 = 3.7.3-3.fc30
python37 = 3.7.3-3.fc30

Resolving RPM dependencies

While rpm knows the required dependencies for each archive, it does not know where to find them. This is by design: rpm only works on local files and must be told exactly where they are. So, if you try to install a single RPM package, you get an error if rpm cannot find the package’s run-time dependencies. This example tries to install a package downloaded from the Fedora package set:


$ ls
python3-elephant-0.6.2-3.fc30.noarch.rpm

$ rpm -qpi python3-elephant-0.6.2-3.fc30.noarch.rpm
Name        : python3-elephant
Version     : 0.6.2
Release     : 3.fc30
Architecture: noarch
Install Date: (not installed)
Group       : Unspecified
Size        : 2574456
License     : BSD
Signature   : (none)
Source RPM  : python-elephant-0.6.2-3.fc30.src.rpm
Build Date  : Fri 14 Jun 2019 17:23:48 BST
Build Host  : buildhw-02.phx2.fedoraproject.org
Relocations : (not relocatable)
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : http://neuralensemble.org/elephant
Bug URL     : https://bugz.fedoraproject.org/python-elephant
Summary     : Elephant is a package for analysis of electrophysiology data in Python
Description :
Elephant - Electrophysiology Analysis Toolkit Elephant is a package for the
analysis of neurophysiology data, based on Neo.

$ rpm -qp --requires python3-elephant-0.6.2-3.fc30.noarch.rpm
python(abi) = 3.7
python3.7dist(neo) >= 0.7.1
python3.7dist(numpy) >= 1.8.2
python3.7dist(quantities) >= 0.10.1
python3.7dist(scipy) >= 0.14.0
python3.7dist(six) >= 1.10.0
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PartialHardlinkSets) <= 4.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1

$ sudo rpm -i ./python3-elephant-0.6.2-3.fc30.noarch.rpm
error: Failed dependencies:
        python3.7dist(neo) >= 0.7.1 is needed by python3-elephant-0.6.2-3.fc30.noarch
        python3.7dist(quantities) >= 0.10.1 is needed by python3-elephant-0.6.2-3.fc30.noarch

In theory, one could download all the packages that are required for python3-elephant, and tell rpm where they all are, but that isn’t convenient. What if python3-neo and python3-quantities have other run-time requirements and so on? Very quickly, the dependency chain can get quite complicated.

Repositories

Luckily, dnf and friends exist to help with this issue. Unlike rpm, dnf is aware of repositories. Repositories are collections of packages, with metadata that tells dnf what these repositories contain. All Fedora systems come with the default Fedora repositories enabled:


$ sudo dnf repolist
repo id              repo name                             status
fedora               Fedora 30 - x86_64                    56,582
fedora-modular       Fedora Modular 30 - x86_64               135
updates              Fedora 30 - x86_64 - Updates           8,573
updates-modular      Fedora Modular 30 - x86_64 - Updates     138
updates-testing      Fedora 30 - x86_64 - Test Updates      8,458

There’s more information on these repositories, and how they can be managed, in the Fedora quick docs.

dnf can be used to query repositories for information on the packages they contain. It can also search them for software, or install/uninstall/upgrade packages from them:


$ sudo dnf search elephant
Last metadata expiration check: 0:05:21 ago on Sun 23 Jun 2019 14:33:38 BST.
============================================================================== Name & Summary Matched: elephant ==============================================================================
python3-elephant.noarch : Elephant is a package for analysis of electrophysiology data in Python
python3-elephant.noarch : Elephant is a package for analysis of electrophysiology data in Python

$ sudo dnf list \*elephant\*
Last metadata expiration check: 0:05:26 ago on Sun 23 Jun 2019 14:33:38 BST.
Available Packages
python3-elephant.noarch      0.6.2-3.fc30      updates-testing
python3-elephant.noarch      0.6.2-3.fc30              updates

Installing dependencies

When installing the package using dnf now, it resolves all the required dependencies, then calls rpm to carry out the transaction:


$ sudo dnf install python3-elephant
Last metadata expiration check: 0:06:17 ago on Sun 23 Jun 2019 14:33:38 BST.
Dependencies resolved.
==============================================================================================================================================================================================
 Package                                      Architecture                     Version                                                        Repository                                 Size
==============================================================================================================================================================================================
Installing:
 python3-elephant                             noarch                           0.6.2-3.fc30                                                   updates-testing                           456 k
Installing dependencies:
 python3-neo                                  noarch                           0.8.0-0.1.20190215git49b6041.fc30                              fedora                                    753 k
 python3-quantities                           noarch                           0.12.2-4.fc30                                                  fedora                                    163 k
Installing weak dependencies:
 python3-igor                                 noarch                           0.3-5.20150408git2c2a79d.fc30                                  fedora                                     63 k

Transaction Summary
==============================================================================================================================================================================================
Install  4 Packages

Total download size: 1.4 M
Installed size: 7.0 M
Is this ok [y/N]: y
Downloading Packages:
(1/4): python3-igor-0.3-5.20150408git2c2a79d.fc30.noarch.rpm                                                                                                  222 kB/s |  63 kB     00:00
(2/4): python3-elephant-0.6.2-3.fc30.noarch.rpm                                                                                                               681 kB/s | 456 kB     00:00
(3/4): python3-quantities-0.12.2-4.fc30.noarch.rpm                                                                                                            421 kB/s | 163 kB     00:00
(4/4): python3-neo-0.8.0-0.1.20190215git49b6041.fc30.noarch.rpm                                                                                               840 kB/s | 753 kB     00:00
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                                                         884 kB/s | 1.4 MB     00:01
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                                                      1/1
  Installing       : python3-quantities-0.12.2-4.fc30.noarch                                                                                                                              1/4
  Installing       : python3-igor-0.3-5.20150408git2c2a79d.fc30.noarch                                                                                                                    2/4
  Installing       : python3-neo-0.8.0-0.1.20190215git49b6041.fc30.noarch                                                                                                                 3/4
  Installing       : python3-elephant-0.6.2-3.fc30.noarch                                                                                                                                 4/4
  Running scriptlet: python3-elephant-0.6.2-3.fc30.noarch                                                                                                                                 4/4
  Verifying        : python3-elephant-0.6.2-3.fc30.noarch                                                                                                                                 1/4
  Verifying        : python3-igor-0.3-5.20150408git2c2a79d.fc30.noarch                                                                                                                    2/4
  Verifying        : python3-neo-0.8.0-0.1.20190215git49b6041.fc30.noarch                                                                                                                 3/4
  Verifying        : python3-quantities-0.12.2-4.fc30.noarch                                                                                                                              4/4

Installed:
  python3-elephant-0.6.2-3.fc30.noarch   python3-igor-0.3-5.20150408git2c2a79d.fc30.noarch   python3-neo-0.8.0-0.1.20190215git49b6041.fc30.noarch   python3-quantities-0.12.2-4.fc30.noarch

Complete!

Notice how dnf even installed python3-igor, which isn’t a direct dependency of python3-elephant.

DnfDragora: a graphical interface to DNF

While technical users may find dnf straightforward to use, it isn’t for everyone. Dnfdragora addresses this issue by providing a graphical front end to dnf.

dnfdragora (version 1.1.1-2 on Fedora 30) listing all the packages installed on a system.

From a quick look, dnfdragora appears to provide all of dnf‘s main functions.

There are other tools in Fedora that also manage packages. GNOME Software and Discover are two examples. GNOME Software is focused on graphical applications only; you can’t use it to install command line or terminal tools such as htop or weechat. However, GNOME Software does support the installation of Flatpak and Snap applications, which dnf does not. So, they are different tools with different target audiences, and they provide different functions.

This post only touches the tip of the iceberg that is the life cycle of software in Fedora. This article explained what RPM packages are, and the main differences between using rpm and using dnf.

In future posts, we’ll speak more about:

  • The processes that are needed to create these packages
  • How the community tests them to ensure that they are built correctly
  • The infrastructure that the community uses to get them to users

Wednesday, 26 June

16:22

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday [The Cloudflare Blog]

A recap on what happened Monday


On Monday we wrote about a painful Internet wide route leak. We wrote that this should never have happened because Verizon should never have forwarded those routes to the rest of the Internet. That blog entry came out around 19:58 UTC, just over seven hours after the route leak ended (which, as we will see below, was around 12:39 UTC). Today we will dive into the archived routing data and analyze it. The format of the code below is meant to use simple shell commands so that any reader can follow along and, more importantly, do their own investigations on the routing tables.

This was a very public BGP route leak event. It was covered by many news outlets, and the event's BGP data was shared on social media as it was happening. Andree Toonk tweeted a quick list of 2,400 ASNs that were affected.

This blog contains a large number of acronyms; they are explained at the end.

Using RIPE NCC archived data

The RIPE NCC operates a very useful archive of BGP routing data. It runs collectors globally and provides an API for querying the data. More can be seen at https://stat.ripe.net/. In the world of BGP, all routing is public (limited only by how many collection points a data collector operates). The archived data is very valuable for research, and that's what we will use in this blog. The site can also create some very useful data visualizations.

[Figure: RIPEstat data visualization]

Dumping the RIPEstat data for this event

Presently, the RIPEstat data gets ingested around eight to twelve hours after real time; it's not meant to be a real-time service. The data can be queried in many ways, including via a full web interface and an API. We are using the API to extract the data in JSON format.

We are going to focus only on the Cloudflare routes that were leaked. Many other ASNs were leaked (see the tweet above); however, we want to deal with a finite data set and focus on what happened to Cloudflare's routing. All the commands below can be run with ease on many systems. Both the scripts and the raw data file are now available on GitHub. The following was done on a MacBook Pro running macOS Mojave.

First we collect 24 hours of route announcements and AS-PATH data that RIPEstat sees coming from AS13335 (Cloudflare).

$ # Collect 24 hours of data - more than enough
$ ASN="AS13335"
$ START="2019-06-24T00:00:00"
$ END="2019-06-25T00:00:00"
$ ARGS="resource=${ASN}&starttime=${START}&endtime=${END}"
$ URL="https://stat.ripe.net/data/bgp-updates/data.json?${ARGS}"
$ # Fetch the data from RIPEstat
$ curl -sS "${URL}" | jq . > 13335-routes.json
$ ls -l 13335-routes.json
-rw-r--r--  1 martin  staff  339363899 Jun 25 08:47 13335-routes.json
$

That’s 340MB of data - which seems like a lot, but it contains plenty of white space and plenty of data we just don’t need. Our second task is to reduce this raw data down to just the required fields: timestamps, actual routes, and AS-PATHs. The AS-PATH will be very useful later. Note we are using jq, which can be installed on macOS with the brew package manager.

$ # Extract just the times, routes, and AS-PATH
$ jq -rc '.data.updates[]|.timestamp,.attrs.target_prefix,.attrs.path' < 13335-routes.json | paste - - - > 13335-listing-a.txt
$ wc -l 13335-listing-a.txt
691318 13335-listing-a.txt
$

We are down to just below seven hundred thousand routing events. However, that’s not just the leak; that’s everything that includes Cloudflare’s ASN (the number 13335 above). For the leak we need to go back to Monday’s blog and recall that it was AS396531 (Allegheny Technologies) that showed up alongside 701 (Verizon). Now we reduce the data further:

$ # Extract the route leak 701,396531
$ # AS701 is Verizon and AS396531 is Allegheny Technologies
$ egrep '701,396531' < 13335-listing-a.txt > 13335-listing-b.txt
$ wc -l 13335-listing-b.txt
204568 13335-listing-b.txt
$

At 204 thousand data points, we are looking better. It’s still a lot of data, because BGP can be very chatty when the topology is changing - and a route leak causes exactly that. Now let’s see how many routes were affected:

$ # Extract the actual routes affected by the route leak
$ cut -f2 < 13335-listing-b.txt | sort -V -u > 13335-listing-c.txt
$ wc -l 13335-listing-c.txt
101 13335-listing-c.txt
$

It’s a much smaller number. We now have a listing of at least 101 routes that were leaked via Verizon. This may not be the full list, because route collectors like RIPEstat don’t have direct feeds from Verizon, so this data is a blended view of Verizon’s path and other paths. We can see that if we look at the AS-PATHs in the above files. Please note that when this blog was first published, this script had a typo: sort was run with the -n option instead of -V, so only 20 routes showed up. The list is now correct with 101 affected routes. See this short article from Stack Overflow for more on the issue.

Here’s a partial listing of affected routes.

$ cat 13335-listing-c.txt
8.39.214.0/24
8.42.245.0/24
8.44.58.0/24
...
104.16.80.0/21
104.17.168.0/21
104.18.32.0/21
104.19.168.0/21
104.20.64.0/21
104.22.8.0/21
104.23.128.0/21
104.24.112.0/21
104.25.144.0/21
104.26.0.0/21
104.27.160.0/21
104.28.16.0/21
104.31.0.0/21
141.101.120.0/23
162.159.224.0/21
172.68.60.0/22
172.69.116.0/22
...
$

This is an interesting list: some of these routes do not originate from Cloudflare’s network, yet they show up with AS13335 (our ASN) as the originator. For example, the 104.26.0.0/21 route is not announced from our network, but we do announce 104.26.0.0/20 (which covers it). More importantly, we have an IRR (Internet Routing Registries) route object plus an RPKI ROA for that block. Here’s the IRR object:

route:          104.26.0.0/20
origin:         AS13335
source:         ARIN

And here’s the RPKI ROA. This ROA has Max Length set to /20, so no more-specific route should be accepted.

Prefix:       104.26.0.0/20
Max Length:   /20
ASN:          13335
Trust Anchor: ARIN
Validity:     Thu, 02 Aug 2018 04:00:00 GMT - Sat, 31 Jul 2027 04:00:00 GMT
Emitted:      Thu, 02 Aug 2018 21:45:37 GMT
Name:         535ad55d-dd30-40f9-8434-c17fc413aa99
Key:          4a75b5de16143adbeaa987d6d91e0519106d086e
Parent Key:   a6e7a6b44019cf4e388766d940677599d0c492dc
Path:         rsync://rpki.arin.net/repository/arin-rpki-ta/5e4a23ea-...

The Max Length field in a ROA specifies the longest prefix (that is, the smallest block) the origin AS is allowed to announce. The fact that this is a /20 route with a /20 Max Length means that a /21 (or /22, /23, or /24) within this IP space isn’t allowed. Looking further at the route list above, we get the following listing:

Route Seen            Cloudflare IRR & ROA    ROA Max Length
104.16.80.0/21    ->  104.16.80.0/20          /20
104.17.168.0/21   ->  104.17.160.0/20         /20
104.18.32.0/21    ->  104.18.32.0/20          /20
104.19.168.0/21   ->  104.19.160.0/20         /20
104.20.64.0/21    ->  104.20.64.0/20          /20
104.22.8.0/21     ->  104.22.0.0/20           /20
104.23.128.0/21   ->  104.23.128.0/20         /20
104.24.112.0/21   ->  104.24.112.0/20         /20
104.25.144.0/21   ->  104.25.144.0/20         /20
104.26.0.0/21     ->  104.26.0.0/20           /20
104.27.160.0/21   ->  104.27.160.0/20         /20
104.28.16.0/21    ->  104.28.16.0/20          /20
104.31.0.0/21     ->  104.31.0.0/20           /20
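
To make the Max Length rule concrete, here is a minimal sketch of the origin validation logic in Go (our own illustration using the standard library's net/netip, not any router's implementation): an announcement covered by a ROA is invalid if its origin ASN differs or its prefix is longer than Max Length.

package main

import (
    "fmt"
    "net/netip"
)

// roaValidate classifies one announcement against one ROA:
// "unknown" if the ROA doesn't cover the announced prefix,
// "valid" if origin and length match, "invalid" otherwise.
func roaValidate(ann netip.Prefix, origin uint32, roa netip.Prefix, maxLen int, roaASN uint32) string {
    // The ROA covers the announcement only if the announced
    // prefix falls entirely inside the ROA's prefix.
    if ann.Bits() < roa.Bits() || !roa.Contains(ann.Addr()) {
        return "unknown"
    }
    if origin == roaASN && ann.Bits() <= maxLen {
        return "valid"
    }
    return "invalid"
}

func main() {
    // Cloudflare's ROA from above: 104.26.0.0/20, Max Length /20, ASN 13335.
    roa := netip.MustParsePrefix("104.26.0.0/20")
    fmt.Println(roaValidate(netip.MustParsePrefix("104.26.0.0/21"), 13335, roa, 20, 13335)) // invalid: /21 exceeds Max Length /20
    fmt.Println(roaValidate(netip.MustParsePrefix("104.26.0.0/20"), 13335, roa, 20, 13335)) // valid
}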

So how did all these /21’s show up? That’s where we dive into the world of BGP route optimization systems and their propensity to synthesize routes that should not exist. If those routes leak (and it’s very clear after this week that they can), all hell breaks loose. That is compounded when not one but two ISPs allow invalid routes to propagate outside their autonomous network. We will explore the AS-PATH further down in this blog.

More than 20 years ago, RFC1997 added the concept of communities to BGP. Communities are a way of tagging or grouping route advertisements. Communities are often used to label routes so that specific handling policies can be applied. RFC1997 includes a small number of universal well-known communities. One of these is the NO_EXPORT community, which has the following specification:

    All routes received carrying a communities attribute
    containing this value MUST NOT be advertised outside a BGP
    confederation boundary (a stand-alone autonomous system that
    is not part of a confederation should be considered a
    confederation itself).

The use of the NO_EXPORT community is very common within BGP-enabled networks, and it is a tag that would have helped alleviate this route leak immensely.
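
For reference, the well-known community values from RFC1997 are fixed 32-bit constants. Here is a hedged sketch in Go of how an export filter might recognize NO_EXPORT (the constant values come from the RFC; the helper function is our own illustration, not any router's code):

package main

import "fmt"

// Well-known BGP community values defined by RFC1997.
const (
    NoExport          uint32 = 0xFFFFFF01 // NO_EXPORT
    NoAdvertise       uint32 = 0xFFFFFF02 // NO_ADVERTISE
    NoExportSubconfed uint32 = 0xFFFFFF03 // NO_EXPORT_SUBCONFED
)

// shouldExport reports whether a route carrying the given communities
// may be advertised outside the local AS (or confederation boundary).
func shouldExport(communities []uint32) bool {
    for _, c := range communities {
        if c == NoExport {
            return false // MUST NOT be advertised across the boundary
        }
    }
    return true
}

func main() {
    // A route tagged NO_EXPORT stays inside the AS that received it.
    fmt.Println(shouldExport([]uint32{NoExport}))   // false
    fmt.Println(shouldExport([]uint32{0x0DBC0001})) // true (an ordinary community)
}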

How BGP route optimization systems work (or don’t work in this case) can be a subject for a whole other blog entry.

Timing of the route leak

As we saved away the timestamps in the JSON file and in the text files, we can confirm the time for every route in the route leak by looking at the first and the last timestamp of a route in the data. We saved data from 00:00:00 UTC until 00:00:00 the next day, so we know we have covered the period of the route leak. We write a script that checks the first and last entry for every route and reports the information sorted by start time:

$ # Extract the timing of the route leak
$ while read cidr
do
  echo $cidr
  fgrep $cidr < 13335-listing-b.txt | head -1 | cut -f1
  fgrep $cidr < 13335-listing-b.txt | tail -1 | cut -f1
done < 13335-listing-c.txt |\
paste - - - | sort -k2,3 | column -t | sed -e 's/2019-06-24T//g'
...
104.25.144.0/21   10:34:25  12:38:54
104.22.8.0/21     10:34:27  12:29:39
104.20.64.0/21    10:34:27  12:30:00
104.23.128.0/21   10:34:27  12:30:34
141.101.120.0/23  10:34:27  12:30:39
162.159.224.0/21  10:34:27  12:30:39
104.18.32.0/21    10:34:29  12:30:34
104.24.112.0/21   10:34:29  12:30:34
104.27.160.0/21   10:34:29  12:30:34
104.28.16.0/21    10:34:29  12:30:34
104.31.0.0/21     10:34:29  12:30:34
8.39.214.0/24     10:34:31  12:19:24
104.26.0.0/21     10:34:36  12:29:53
172.68.60.0/22    10:34:38  12:19:24
172.69.116.0/22   10:34:38  12:19:24
8.44.58.0/24      10:34:38  12:19:24
8.42.245.0/24     11:52:49  11:53:19
104.17.168.0/21   12:00:13  12:29:34
104.16.80.0/21    12:00:13  12:30:00
104.19.168.0/21   12:09:39  12:29:34
...
$

Now we know the times. The route leak started at 10:34:25 UTC (just before lunchtime London time) on 2019-06-24 and ended at 12:38:54 UTC. That’s a hair over two hours. Here’s that same time data in a graphical form showing the near-instant start of the event and the duration of each route leaked:

[Figure: timeline showing the near-instant start of the event and the duration of each leaked route]

We can also go back to RIPEstat and look at the activity graph for Cloudflare’s AS13335 network:

[Figure: RIPEstat activity graph for AS13335]

Clearly between 10:30 UTC and 12:40 UTC there’s a lot of route activity - far more than normal.

Note that as we mentioned above, RIPEstat doesn’t get a full view of Verizon’s network routing and hence some of the propagated routes won’t show up.

Drilling down on the AS-PATH part of the data

Having the routes is useful, but now we want to look at the paths of these leaked routes to see which ASNs are involved. We knew the offending ASNs during the route leak on Monday. Now we want to dig deeper using the archived data. This allows us to see the extent and reach of this route leak.

$ # Extract the AS-PATH of the route leak
$ # Use the list of routes to extract the full AS-PATH
$ # Merge the results together to show an amalgamation of paths.
$ # We know (luckily) the last few ASNs in the AS-PATH are consistent
$ cut -f3 < 13335-listing-b.txt | tr -d '[\[\]]' |\
awk '{
  n=split($0, a, ",");
  printf "%50s\n",
    a[n-5] "_" a[n-4] "_" a[n-3] "_" a[n-2] "_" a[n-1] "_" a[n];
}' | sort -u
   174_701_396531_33154_3356_13335
   2497_701_396531_33154_174_13335
   577_701_396531_33154_3356_13335
   6939_701_396531_33154_174_13335
  1239_701_396531_33154_3356_13335
  1273_701_396531_33154_3356_13335
  1280_701_396531_33154_3356_13335
  2497_701_396531_33154_3356_13335
  2516_701_396531_33154_3356_13335
  3320_701_396531_33154_3356_13335
  3491_701_396531_33154_3356_13335
  4134_701_396531_33154_3356_13335
  4637_701_396531_33154_3356_13335
  6453_701_396531_33154_3356_13335
  6461_701_396531_33154_3356_13335
  6762_701_396531_33154_3356_13335
  6830_701_396531_33154_3356_13335
  6939_701_396531_33154_3356_13335
  7738_701_396531_33154_3356_13335
 12956_701_396531_33154_3356_13335
 17639_701_396531_33154_3356_13335
 23148_701_396531_33154_3356_13335
$

This script clearly shows the AS-PATHs of the leaked routes, and they are very consistent. Reading from the back of each line to the front, we have 13335 (Cloudflare), then 3356 or 174 (Level3/CenturyLink or Cogent - both tier 1 transit providers for Cloudflare). So far, so good. Then we have 33154 (DQE Communications) and 396531 (Allegheny Technologies Inc), which is still technically not a leak, but trending that way. The reason we can say it’s still technically not a leak is that we don’t know the relationship between those two ASes; it’s possible they have a mutual transit agreement between them. That’s up to them.

Back to the AS-PATHs. Still reading leftward, we see 701 (Verizon), which is very, very bad and clear evidence of a leak. It’s a leak for two reasons. First, this matches the path signature of a transit provider leaking a non-customer route learned from a customer: this Verizon customer does not have 13335 (Cloudflare) listed as a customer. Second, the route contains a tier 1 ASN within its path. This is the point where the route leak should have been absolutely squashed by filtering on the customer BGP session. Beyond this point there be dragons.

And dragons there be! Everything above is about how Verizon filtered (or didn’t filter) its customer. What follows the 701 (i.e., the numbers to the left of it) are the peers or customers of Verizon that accepted these leaked routes. In this list, they are mainly other tier 1 networks: 174 (Cogent), 1239 (Sprint), 1273 (Vodafone), 3320 (DTAG), 3491 (PCCW), 6461 (Zayo), 6762 (Telecom Italia), etc.

What’s missing from that list are three networks worth mentioning: 1299 (Telia), 2914 (NTT), and 7018 (AT&T). All three implement a very simple AS-PATH filter which saved the day for their networks: they do not allow one tier 1 ISP to send them a route that has another tier 1 further down the path. When that happens, it’s officially a leak, because each tier 1 is fully connected to all other tier 1’s (which is part of the definition of a tier 1 network). In the topology of the Internet’s global BGP routing tables, if you see another tier 1 in the path, it’s a bad route and it should be filtered away.
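
As an illustration of that simple filter, here is a minimal sketch in Go (our own rendering of the rule, with an abridged tier 1 list; not any network's actual configuration): a route heard from a tier 1 peer is rejected if any other tier 1 ASN appears in its AS-PATH.

package main

import "fmt"

// tier1 is an illustrative (incomplete) set of tier 1 ASNs.
var tier1 = map[uint32]bool{
    174: true, 701: true, 1239: true, 1273: true, 1299: true,
    2914: true, 3320: true, 3356: true, 3491: true, 6461: true,
    6762: true, 7018: true,
}

// acceptFromTier1 implements the AS-PATH filter described above:
// a route learned from a tier 1 peer must not contain any *other*
// tier 1 ASN anywhere along the path.
func acceptFromTier1(peer uint32, asPath []uint32) bool {
    for _, asn := range asPath {
        if asn != peer && tier1[asn] {
            return false // another tier 1 in the path: leaked route
        }
    }
    return true
}

func main() {
    // One of the leaked paths from the data above, as heard from
    // Verizon (701): 701 396531 33154 3356 13335.
    leaked := []uint32{701, 396531, 33154, 3356, 13335}
    fmt.Println(acceptFromTier1(701, leaked)) // false: 3356 (Level3) is also in the path
}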

Additionally, we know that 7018 (AT&T) operates a network that drops RPKI invalids. Because Cloudflare routes are RPKI-signed, this means AT&T would have dropped these routes when it received them from Verizon. This shows a clear win for RPKI (and for AT&T, as you can see in their bandwidth graph below)!

That all said, keep in mind we are still talking about routes that Cloudflare didn't announce. They all came from the route optimizer.

What should Verizon (AS701) accept from their customer AS396531?

This is a great question to ask. Normally we would look at the IRR (Internet Routing Registries) to see what policy an ASN wants for its routes.

$ whois -h whois.radb.net AS396531 ; the Verizon customer
%  No entries found for the selected source(s).
$ whois -h whois.radb.net AS33154  ; the downstream of that customer
%  No entries found for the selected source(s).
$ 

That’s enough to say that we should not be seeing this ASN anywhere on the Internet. However, we should go further in checking this. As we know the ASN of the network, we can search for any routes that are listed for that ASN. We find one:

$ whois -h whois.radb.net ' -i origin AS396531' | egrep '^route|^origin|^mnt-by|^source'
route:          192.92.159.0/24
origin:         AS396531
mnt-by:         MNT-DCNSL
source:         ARIN
$

More importantly, now we have a maintainer (the owner of the routing registry entries). We can see what else is registered under that maintainer, and we are specifically looking for this:

$ whois -h whois.radb.net ' -i mnt-by -T as-set MNT-DCNSL' | egrep '^as-set|^members|^mnt-by|^source'
as-set:         AS-DQECUST
members:        AS4130, AS5050, AS11199, AS11360, AS12017, AS14088, AS14162,
                AS14740, AS15327, AS16821, AS18891, AS19749, AS20326,
                AS21764, AS26059, AS26257, AS26461, AS27223, AS30168,
                AS32634, AS33039, AS33154, AS33345, AS33358, AS33504,
                AS33726, AS40549, AS40794, AS54552, AS54559, AS54822,
                AS393456, AS395440, AS396531, AS15204, AS54119, AS62984,
                AS13659, AS54934, AS18572, AS397284
mnt-by:         MNT-DCNSL
source:         ARIN
$

This object is important. It lists all the downstream ASNs that this network is expected to announce to the world. It does not contain Cloudflare’s ASN (or any of the leaked ASNs). Clearly this as-set was not used for any BGP filtering.
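
To illustrate what as-set-based filtering could look like, here is a hedged sketch in Go; the member list is abridged from the whois output above, and real deployments typically expand as-sets into router prefix filters with tools such as bgpq3 rather than code like this:

package main

import "fmt"

// asSetDQECUST is the expanded member list of AS-DQECUST from the
// whois output above (abridged for the example).
var asSetDQECUST = map[uint32]bool{
    4130: true, 5050: true, 11199: true, 33154: true, 396531: true,
    // ... remaining members from the whois output above
}

// acceptFromCustomer implements as-set-based origin filtering:
// only routes originated by a member of the customer's as-set pass.
func acceptFromCustomer(origin uint32) bool {
    return asSetDQECUST[origin]
}

func main() {
    fmt.Println(acceptFromCustomer(396531)) // true: a listed member
    fmt.Println(acceptFromCustomer(13335))  // false: Cloudflare is not in AS-DQECUST
}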

Just for completeness, the same exercise can be done for the other ASN (the downstream of the customer of Verizon). In this case, we just searched for the maintainer object (as there are plenty of route and route6 objects listed).

$ whois -h whois.radb.net ' -i origin AS33154' | egrep '^mnt-by' | sort -u
mnt-by:         MNT-DCNSL
mnt-by:     MAINT-AS3257
mnt-by:     MAINT-AS5050
$

None of these maintainers are directly related to 33154 (DQE Communications). They were created by other parties, and hence they become a dead end in that search.

It’s worth doing a secondary search to see if any as-set objects exist that include 33154 or 396531. We turned to the most excellent IRR Explorer website run by NLNOG, which provides deep insight into the routing registry data. We did a simple search for 33154 using http://irrexplorer.nlnog.net/search/33154 and found these as-set objects:

[Figure: IRR Explorer results for AS33154]

It’s interesting to see this ASN listed in other as-sets, but none are important or related to Monday’s route leak. Next, we looked at 396531:

[Figure: IRR Explorer results for AS396531]

This shows that there’s nowhere else we need to check. AS-DQECUST is the as-set macro that controls (or should control) filtering for any transit provider of their network.

The summary of the investigation is a solid statement: no Cloudflare routes or ASNs are listed anywhere within the routing registry data for Verizon’s customer. And given the 2,400 ASNs listed in the tweet above, we can conclusively state that no filtering was in place, and hence this route leak went on its way unabated.

IPv6? Where is the IPv6 route leak?

In what could be considered the only plus from Monday’s route leak, we can confirm that there was no route leak within IPv6 space. Why?

It turns out that 396531 (Allegheny Technologies Inc) is a network without IPv6 enabled. Normally you would hear Cloudflare chastise anyone who has yet to enable IPv6; however, in this case we are quite happy that one of the two protocol families survived. IPv6 was stable during this route leak, which can now be called an IPv4-only route leak.

Yet that’s not really the whole story. Let’s look at the percentage of traffic Cloudflare sends Verizon that’s IPv6 (vs IPv4). Normally the IPv4/IPv6 percentage holds steady.

[Figure: percentage of Cloudflare-to-Verizon traffic that is IPv6]

This uptick in IPv6 traffic could be the direct result of Happy Eyeballs on mobile handsets picking a working IPv6 path into Cloudflare over a non-working IPv4 path. Happy Eyeballs is meant to protect against IPv6 failure; in this case it did a wonderful job of protecting against an IPv4 failure. We do have to be careful with this graph, though: on further investigation, the percentage increased only because IPv4 traffic dropped. Graphs can be misinterpreted, yet Happy Eyeballs still did a good job even as end users were being affected.

Happy Eyeballs, described in RFC8305, is a mechanism where a client (let’s say a mobile device) tries to connect to a website over both IPv4 and IPv6 simultaneously, with IPv6 sometimes given a head start. The theory is that, should a failure exist on one of the paths (sadly, the IPv6 path is the usual victim), IPv4 will save the day. Monday was a day of opposites for Verizon.
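
Incidentally, Go's standard library dialer implements Happy Eyeballs, so clients written in Go get this behavior for free. A minimal, hedged sketch (example.com:443 is a placeholder endpoint) showing the knob that controls the preferred-path head start:

package main

import (
    "fmt"
    "net"
    "time"
)

func main() {
    d := net.Dialer{
        Timeout: 5 * time.Second,
        // FallbackDelay is the Happy Eyeballs delay: how long to wait
        // on the preferred (usually IPv6) path before also racing the
        // fallback connection. Go uses 300ms when this is zero.
        FallbackDelay: 300 * time.Millisecond,
    }
    conn, err := d.Dial("tcp", "example.com:443")
    if err != nil {
        fmt.Println("dial failed:", err)
        return
    }
    defer conn.Close()
    fmt.Println("connected via", conn.RemoteAddr()) // shows which family won the race
}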

In fact, enabling IPv6 for mobile users is the one area where we can praise the Verizon network this week (or at least the Verizon mobile network), unlike the residential Verizon networks, where IPv6 is almost non-existent.

Using bandwidth graphs to confirm routing leaks and stability

As we have already stated, Verizon impacted their own users and customers. Let’s start with their bandwidth graph:

[Figure: Verizon bandwidth graph]

The red line is 24 June 2019 (00:00 UTC to 00:00 UTC the next day). The gray lines are previous days for comparison. This graph includes both Verizon fixed-line services like FiOS along with mobile.

The AT&T graph is quite different.

[Figure: AT&T bandwidth graph]

There’s no perturbation. This, along with some direct confirmation, shows that 7018 (AT&T) was not affected. This is an important point.

Going back and looking at a third tier 1 network, we can see that 6762 (Telecom Italia) was affected by this route leak, even though Cloudflare has a direct interconnect with them.

[Figure: Telecom Italia bandwidth graph]

We will be asking Telecom Italia to improve their route filtering as we now have this data.

Future work that could have helped on Monday

The IETF is doing work in the area of BGP path protection within the Secure Inter-Domain Routing Operations Working Group (sidrops) area. The charter of this IETF group is:

The SIDR Operations Working Group (sidrops) develops guidelines for the operation of SIDR-aware networks, and provides operational guidance on how to deploy and operate SIDR technologies in existing and new networks.

One new effort from this group should be called out, because it shows how important the issue of route leaks like today’s event is. The draft document from Alexander Azimov et al. named draft-ietf-sidrops-aspa-profile (ASPA stands for Autonomous System Provider Authorization) extends the RPKI data structures to handle BGP path information. This is ongoing work, and Cloudflare and other companies are clearly interested in seeing it progress further.

However, as we said in Monday’s blog and something we should reiterate again and again: Cloudflare encourages all network operators to deploy RPKI now!

Acronyms used in the blog

  • API - Application Programming Interface
  • AS-PATH - The list of ASNs that a route has traversed so far
  • ASN - Autonomous System Number - A unique number assigned for each network on the Internet
  • BGP - Border Gateway Protocol (version 4) - the core routing protocol for the Internet
  • IETF - Internet Engineering Task Force - an open standards organization
  • IPv4 - Internet Protocol version 4
  • IPv6 - Internet Protocol version 6
  • IRR - Internet Routing Registries - a database of Internet route objects
  • ISP - Internet Service Provider
  • JSON - JavaScript Object Notation - a lightweight data-interchange format
  • RFC - Request For Comment - published by the IETF
  • RIPE NCC - Réseaux IP Européens Network Coordination Centre - a regional Internet registry
  • ROA - Route Origin Authorization - a cryptographically signed attestation of a BGP route announcement
  • RPKI - Resource Public Key Infrastructure - a public key infrastructure framework for routing information
  • Tier 1 - A network that has no default route and peers with all other tier 1's
  • UTC - Coordinated Universal Time - a time standard for clocks and time
  • "there be dragons" - a mistype, as it was meant to be "here be dragons" which means dangerous or unexplored territories

Tuesday, 25 June

18:00

Autoscaling AWS Step Functions Activities [Yelp Engineering and Product Blog]

In an ongoing effort to break down our monolithic applications into microservices here at Yelp, we’ve migrated several business flows to modern architecture using AWS Step Functions. Transactional ordering at Yelp covers a wide variety of verticals, including food (delivery/takeout orders), booking, home services, and many more. These orders are processed via Step Functions, where each is represented as an execution instance of the workflow, as shown below.

[Figure 1: Illustrative Step Functions Workflow for Transaction Orders]

Each step in the above workflow is an “activity,” and Yelp implements these activities as batch daemons, which interact with AWS Step Functions...

10:00

Deeper Connection with the Local Tech Community in India [The Cloudflare Blog]


On June 6th, 2019, Cloudflare hosted its first ever customer event in a beautiful, green district of Bangalore, India. More than 60 people, including executives, developers, engineers, and even university students, attended the half-day forum.


The forum kicked off with a series of presentations on the current DDoS landscape, cybersecurity trends, serverless computing, and Cloudflare Workers. Trey Quinn, Cloudflare Global Head of Solution Engineering, gave a brief introduction to the evolution of edge computing.


We also invited business and thought leaders across various industries to share their insights and best practices on cybersecurity and performance strategy. Some of the keynote and panel sessions included live demos from our customers.


At this event, guests gained first-hand knowledge of the latest technology. They also learned insider tactics to help them protect their business, accelerate performance, and identify quick wins in a complex Internet environment.


To conclude the event, we arranged a dinner for the guests to network and enjoy a cool summer night.


Through this event, Cloudflare strengthened its connection with the local tech community. The event's success owes much to Cloudflare's constant improvement and to the continuous support of our customers in India.

As the old saying goes, भारत महान है (India is great). India is an important market in the region, and Cloudflare will increase its investment and engagement to provide better services and user experience for customers in India.

07:00

Get Cloudflare insights in your preferred analytics provider [The Cloudflare Blog]


Today, we’re excited to announce our partnerships with Chronicle Security, Datadog, Elastic, Looker, Splunk, and Sumo Logic to make it easy for our customers to analyze Cloudflare logs and metrics using their analytics provider of choice. In a joint effort, we have developed pre-built dashboards that are available as a Cloudflare App in each partner’s platform. These dashboards help customers better understand events and trends from their websites and applications on our network.

Cloudflare insights in the tools you're already using

Data analytics is a frequent theme in conversations with Cloudflare customers. Our customers want to understand how Cloudflare speeds up their websites and saves them bandwidth, to rank their fastest and slowest pages, and to be alerted if they are under attack. While providing insights is a core tenet of Cloudflare's offering, the data analytics market has matured, and many of our customers have started using third-party providers to analyze data—including Cloudflare logs and metrics. By aggregating data from multiple applications, infrastructure, and cloud platforms in one dedicated analytics platform, customers can create a single pane of glass and benefit from better end-to-end visibility over their entire stack.

While these analytics platforms provide great benefits in terms of functionality and flexibility, they can take significant time to configure: from ingesting logs, to specifying data models that make data searchable, all the way to building dashboards to get the right insights out of the raw data. We see this as an opportunity to partner with the companies our customers are already using to offer a better and more integrated solution.

Providing flexibility through easy-to-use integrations

To address these complexities of aggregating, managing, and displaying data, we have developed a number of product features and partnerships to make it easier to get insights out of Cloudflare logs and metrics. In February we announced Logpush, which allows customers to automatically push Cloudflare logs to Google Cloud Storage and Amazon S3. Both of these cloud storage solutions are supported by the major analytics providers as a source for collecting logs, making it possible to get Cloudflare logs into an analytics platform with just a few clicks. With today's announcement of Cloudflare's Analytics Partnerships, we're releasing a Cloudflare App—a set of pre-built and fully customizable dashboards—in each partner’s app store or integrations catalogue to make the experience even more seamless.

By using these dashboards, customers can immediately analyze events and trends of their websites and applications without first needing to wade through individual log files and build custom searches. The dashboards feature all 55+ fields available in Cloudflare logs and include 90+ panels with information about the performance, security, and reliability of customers’ websites and applications.


Ultimately, we want to provide flexibility to our customers and make it easier to use Cloudflare with the analytics tools they already use. Improving our customers’ ability to get better data and insights continues to be a focus for us, so we’d love to hear about what tools you’re using—tell us via this brief survey. To learn more about each of our partnerships and how to get access to the dashboards, please visit our developer documentation or contact your Customer Success Manager. Similarly, if you’re an analytics provider who is interested in partnering with us, use the contact form on our analytics partnerships page to get in touch.

Monday, 24 June

14:46

How Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Today [The Cloudflare Blog]

Massive route leak impacts major Internet services, including Cloudflare

What happened

At 10:30 UTC today (24 June 2019), the Internet suffered quite a shock. Many Internet routes that transit Verizon (AS701), a major Internet transit provider, were steered to a small company in northern Pennsylvania. This was the equivalent of Didi mistakenly navigating an entire freeway down a small alley, leaving many websites on Cloudflare, and the services of many other providers, unreachable from large parts of the Internet. This should never have happened, because Verizon should never have forwarded those routes to the rest of the Internet. To understand why, read on.

Unfortunate events of this kind are not uncommon, and we have blogged about them before. This time, the whole world once again witnessed the serious damage they can cause. The involvement of Noction's "BGP optimizer" product made today's event even worse. This product has a feature that splits received IP prefixes into smaller components (called more-specifics). For example, our own IPv4 route 104.20.0.0/20 was turned into 104.20.0.0/21 and 104.20.8.0/21. It is as if the road sign pointing to "Beijing" were replaced by two signs, one for "Beijing East" and the other for "Beijing West". By splitting major IP blocks into smaller parts, a network can steer traffic within its own infrastructure, but such splits were never meant to be broadcast to the global Internet. That is what caused today's outage.

To explain what happened next, here is a quick review of how the Internet's underlying "map" works. "Internet" literally means a network of networks; it is made up of networks called Autonomous Systems (AS), each with a unique identifier, its AS number. All of these networks are interconnected using the Border Gateway Protocol (BGP). BGP joins these networks together and builds the Internet "map" that enables traffic to travel from, say, your ISP to a popular website on the other side of the globe.

Through BGP, networks exchange route information: how to reach them from wherever you are. These routes can be specific, like looking up a particular city on your GPS, or very broad, like pointing your GPS at a province. This is where things went wrong today.

An Internet service provider in Pennsylvania (AS33154 - DQE Communications) was using a BGP optimizer in its network, which meant there were many more-specific routes in its network. More-specific routes take precedence over more general ones (just as, in Didi, a route to Beijing Railway Station is more specific than a route to Beijing).

DQE announced these specific routes to its customer (AS396531 - Allegheny). All of this routing information was then forwarded to their other transit provider (AS701 - Verizon), who proceeded to leak these "better" routes to the entire Internet. The routes were mistakenly considered "better" because they were more granular, more specific.

The leak should have stopped at Verizon. However, going against the many best practices outlined below, Verizon's lack of filtering turned this leak into a major incident that affected many Internet services, such as Amazon, Linode, and Cloudflare.

This meant that Verizon, Allegheny, and DQE suddenly had to cope with a stampede of Internet users trying to reach those services through their networks. None of these networks was equipped to handle the drastic increase in traffic, which caused disruption in service. Even if they had had sufficient capacity, DQE, Allegheny, and Verizon should never have been allowed to claim that they had the best routes to Cloudflare, Amazon, Linode, and so on...

[Figure: the BGP leak process involving a BGP optimizer]

During the incident, we observed a loss of about 15% of our global traffic at the worst point.

[Figure: Cloudflare traffic levels during the incident]

How can leaks like this be prevented?

There are multiple ways this kind of leak could have been avoided:

First, when configuring a BGP session, a hard limit can be set on the number of prefixes to be received. This means a router can decide to shut down the session if the number of received prefixes exceeds the threshold. Had Verizon had such a prefix limit in place, this incident would not have happened. Setting prefix limits is a best practice, and a provider like Verizon could do it at no cost. There is no good reason for not having such limits in place, other than sloppiness or laziness.

Second, network operators can prevent leaks like this one by implementing IRR-based filtering. The IRR is the Internet Routing Registry; network owners can add entries for their prefixes to these distributed databases. Other network operators can then use those IRR records to generate specific prefix lists for BGP sessions with their peers. If IRR filtering had been used, none of the networks involved would have accepted the more-specific prefixes. What is quite shocking is that, even though IRR filtering has been around (and well documented) for 24 years, Verizon did not implement any such filtering on its BGP session with Allegheny. IRR filtering would not have added any cost for Verizon or limited its service. Again, the only explanation we can think of is sloppiness or laziness.

Third, the RPKI framework that we implemented and deployed globally last year is designed to prevent exactly this type of leak. It enables filtering on origin network and prefix size. The prefixes Cloudflare announces are signed for a maximum length of /20. RPKI then indicates that no more-specific prefix should be accepted, no matter what the path is. For this mechanism to take effect, a network needs to enable BGP Origin Validation. Many providers, such as AT&T, have already enabled it successfully in their networks.

If Verizon had used RPKI, it would have seen that the advertised routes were invalid, and its routers could have dropped them automatically.

Cloudflare encourages all network operators to deploy RPKI now!

[Figure: route leak prevention using IRR, RPKI, and prefix limits]

All of the above recommendations are nicely condensed into MANRS (Mutually Agreed Norms for Routing Security).

How the incident was resolved

Cloudflare's network team contacted the networks involved: AS33154 (DQE Communications) and AS701 (Verizon). Reaching them was not easy, probably because the route leak started in the early morning hours on the US East Coast.

[Figure: screenshot of the email sent to Verizon]

One of our network engineers quickly made contact with DQE Communications, and after a short delay they put us in touch with the people who could fix the problem. DQE worked with us over the phone to stop advertising these "optimized" routes to Allegheny. We are grateful for their help. Once that was done, the Internet stabilized and things returned to normal.

[Figure: screenshots of attempts to contact DQE and Verizon support]

Unfortunately, although we tried to reach Verizon by both email and phone, as of this writing (more than 8 hours after the incident) we have not heard back from them, nor do we know whether they are taking action to resolve the issue.

Cloudflare wishes that events like this never happened, but unfortunately the current state of the Internet does very little to prevent them. It is time for the industry to deploy better, more secure routing through systems like RPKI. We hope that major providers will follow the lead of Cloudflare, Amazon, and AT&T and start validating routes. In particular, we are keeping a close eye on Verizon and are still waiting for their reply.

Although the events that caused this disruption were beyond our control, we are sorry for it. Our team cares deeply about our service; within minutes of identifying the problem, engineers in the US, UK, Australia, and Singapore were online to resolve it.

13:58

How Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Today [The Cloudflare Blog]

Massive route leak impacts major parts of the Internet, including Cloudflare


What happened?

Today at 10:30 UTC, the Internet had a small heart attack. A small company in Northern Pennsylvania became a preferred path of many Internet routes through Verizon (AS701), a major Internet transit provider. This was the equivalent of Waze routing an entire freeway down a neighborhood street — resulting in many websites on Cloudflare, and those of many other providers, being unavailable from large parts of the Internet. This should never have happened because Verizon should never have forwarded those routes to the rest of the Internet. To understand why, read on.

We have blogged about these unfortunate events in the past, as they are not uncommon. This time, the damage was seen worldwide. What exacerbated the problem today was the involvement of a “BGP Optimizer” product from Noction. This product has a feature that splits up received IP prefixes into smaller, contributing parts (called more-specifics). For example, our own IPv4 route 104.20.0.0/20 was turned into 104.20.0.0/21 and 104.20.8.0/21. It’s as if the road sign directing traffic to “Pennsylvania” was replaced by two road signs, one for “Pittsburgh, PA” and one for “Philadelphia, PA”. By splitting these major IP blocks into smaller parts, a network has a mechanism to steer traffic within its own network, but that split should never have been announced to the world at large. When it was, it caused today’s outage.

To explain what happened next, here’s a quick summary of how the underlying “map” of the Internet works. “Internet” literally means a network of networks and it is made up of networks called Autonomous Systems (AS), and each of these networks has a unique identifier, its AS number. All of these networks are interconnected using a protocol called Border Gateway Protocol (BGP). BGP joins these networks together and builds the Internet “map” that enables traffic to travel from, say, your ISP to a popular website on the other side of the globe.

Using BGP, networks exchange route information: how to get to them from wherever you are. These routes can either be specific, similar to finding a specific city on your GPS, or very general, like pointing your GPS to a state. This is where things went wrong today.

An Internet Service Provider in Pennsylvania (AS33154 - DQE Communications) was using a BGP optimizer in their network, which meant there were a lot of more specific routes in their network. Specific routes override more general routes (in the Waze analogy, a route to, say, Buckingham Palace is more specific than a route to London).

DQE announced these specific routes to their customer (AS396531 - Allegheny Technologies Inc). All of this routing information was then sent to their other transit provider (AS701 - Verizon), who proceeded to tell the entire Internet about these “better” routes. These routes were supposedly “better” because they were more granular, more specific.

The leak should have stopped at Verizon. However, against numerous best practices outlined below, Verizon’s lack of filtering turned this into a major incident that affected many Internet services such as Amazon, Linode, and Cloudflare.

What this means is that suddenly Verizon, Allegheny, and DQE had to deal with a stampede of Internet users trying to access those services through their network. None of these networks was suitably equipped to deal with this drastic increase in traffic, causing disruption in service. Even if they had had sufficient capacity, DQE, Allegheny, and Verizon were not allowed to say they had the best route to Cloudflare, Amazon, Linode, etc...

[Figure: BGP leak process with a BGP optimizer]

During the incident, we observed a loss, at the worst of the incident, of about 15% of our global traffic.

[Figure: Traffic levels at Cloudflare during the incident]

How could this leak have been prevented?

There are multiple ways this leak could have been avoided:

A BGP session can be configured with a hard limit on the number of prefixes to be received. This means a router can decide to shut down a session if the number of prefixes goes above the threshold. Had Verizon had such a prefix limit in place, this would not have occurred. Setting these limits is a best practice, and it costs a provider like Verizon nothing to have them in place. There's no good reason, other than sloppiness or laziness, not to have them.
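
As a sketch of the mechanism (our own illustration in Go; on real routers this is a one-line session setting, often called a maximum-prefix limit):

package main

import "fmt"

// maxPrefixes is the hard per-session limit the operator configures.
const maxPrefixes = 1000

// session tracks how many prefixes a BGP peer has announced.
type session struct{ prefixes int }

// onPrefix registers one received prefix and reports whether the
// session should stay up; exceeding the limit tears the session down.
func (s *session) onPrefix() bool {
    s.prefixes++
    return s.prefixes <= maxPrefixes
}

func main() {
    s := &session{}
    for i := 0; i < 1500; i++ {
        if !s.onPrefix() {
            fmt.Println("prefix limit exceeded; shutting down session at", s.prefixes)
            break
        }
    }
}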

A different way network operators can prevent leaks like this one is by implementing IRR-based filtering. IRR is the Internet Routing Registry, and networks can add entries to these distributed databases. Other network operators can then use these IRR records to generate specific prefix lists for the BGP sessions with their peers. If IRR filtering had been used, none of the networks involved would have accepted the faulty more-specifics. What’s quite shocking is that it appears that Verizon didn’t implement any of this filtering in their BGP session with Allegheny Technologies, even though IRR filtering has been around (and well documented) for over 24 years. IRR filtering would not have increased Verizon's costs or limited their service in any way. Again, the only explanation we can conceive of why it wasn't in place is sloppiness or laziness.

The RPKI framework that we implemented and deployed globally last year is designed to prevent this type of leak. It enables filtering on origin network and prefix size. The prefixes Cloudflare announces are signed for a maximum size of 20. RPKI then indicates any more-specific prefix should not be accepted, no matter what the path is. In order for this mechanism to take action, a network needs to enable BGP Origin Validation. Many providers like AT&T have already enabled it successfully in their network.

If Verizon had used RPKI, they would have seen that the advertised routes were not valid, and the routes could have been automatically dropped by the router.

Cloudflare encourages all network operators to deploy RPKI now!

[Figure: Route leak prevention using IRR, RPKI, and prefix limits]

All of the above suggestions are nicely condensed into MANRS (Mutually Agreed Norms for Routing Security).

How it was resolved

The network team at Cloudflare reached out to the networks involved, AS33154 (DQE Communications) and AS701 (Verizon). We had difficulty reaching either network; this may have been because the route leak started while it was still early morning on the East Coast of the US.

[Figure: Screenshot of the email sent to Verizon]

One of our network engineers made contact with DQE Communications quickly and after a little delay they were able to put us in contact with someone who could fix the problem. DQE worked with us on the phone to stop advertising these “optimized” routes to Allegheny Technologies Inc. We're grateful for their help. Once this was done, the Internet stabilized, and things went back to normal.

[Figure: Screenshots of attempts to communicate with DQE and Verizon support]

It is unfortunate that while we tried both e-mail and phone calls to reach out to Verizon, at the time of writing this article (over 8 hours after the incident), we have not heard back from them, nor are we aware of them taking action to resolve the issue.

At Cloudflare, we wish that events like this never take place, but unfortunately the current state of the Internet does very little to prevent incidents such as this one from occurring. It's time for the industry to adopt better routing security through systems like RPKI. We hope that major providers will follow the lead of Cloudflare, Amazon, and AT&T and start validating routes. And, in particular, we're looking at you Verizon — and still waiting on your reply.

Despite this being caused by events outside our control, we’re sorry for the disruption. Our team cares deeply about our service and we had engineers in the US, UK, Australia, and Singapore online minutes after this problem was identified.

01:00

Using i3 with multiple monitors [Fedora Magazine]

Are you using multiple monitors with your Linux workstation? Seeing many things at once might be beneficial. But there are often many more windows in our workflows than physical monitors — and that’s a good thing, because seeing too many things at once might be distracting. So being able to switch what we see on individual monitors seems crucial.

Let’s talk about i3 — a popular tiling window manager that works great with multiple monitors. And there is one handy feature that many other window managers don’t have — the ability to switch workspaces on individual monitors independently.

Quick introduction to i3

Fedora Magazine covered i3 about three years ago, and it was one of the most popular articles ever published! Even though software articles don’t always age well, i3 is pretty stable, and that article is still very accurate today. So — not to repeat ourselves too much — this article covers only the very minimum to get i3 up and running, and you’re welcome to go ahead and read the earlier article if you’re new to i3 and want to learn more about the basics.

To install i3 on your system, run the following command:

$ sudo dnf install i3

When that’s done, log out, and on the log in screen choose i3 as your window manager and log back in again.

When you run i3 for the first time, you’ll be asked if you wish to proceed with automatic configuration — answer yes here. After that, you’ll be asked to choose a “mod key”. If you’re not sure, just accept the default, which sets your Windows/Super key as the mod key. You’ll use this key for most of the shortcuts within the window manager.

At this point, you should see a little bar at the bottom and an empty screen. Let’s have a look at some of the basic shortcuts.

Open a terminal using:

$mod + enter

Switch to a second workspace using:

$mod + 2

Open Firefox in two steps, first by:

$mod + d

… and then by typing “firefox” and pressing enter.

Move it to the first workspace by:

$mod + shift + 1

… and switch to the first workspace by:

$mod + 1

At this point, you’ll see a terminal and a firefox window side by side. To close a window, press:

$mod + shift + q

There are more shortcuts, but these should give you the minimum to get started with i3.

Ah! And to exit i3 (to log out) press:

$mod + shift + e

… and then confirm using your mouse at the top-right corner.

Getting multiple screens to work

Now that we have i3 up and running, let’s put all those screens to work!

To do that, we’ll need to use the command line, as i3 is very lightweight and doesn’t have a GUI to manage additional screens. But don’t worry if that sounds difficult — it’s actually quite straightforward!

The command we’ll use is called xrandr. If you don’t have xrandr on your system, install it by running:

$ sudo dnf install xrandr

When that’s installed, let’s just go ahead and run it:

$ xrandr

The output lists all the available outputs, and also indicates which have a screen attached to them (a monitor connected with a cable) by showing supported resolutions. The good news is that we don’t really need to care about the specific resolutions to make them work.

This specific example shows the primary screen of a laptop (named eDP1) and a second monitor connected to the HDMI-2 output, physically positioned to the right of the laptop. To turn it on, run the following command:

$ xrandr --output HDMI-2 --auto --right-of eDP1

And that’s it! Your screen is now active.

Second screen active. The commands shown on this screenshot are slightly different than in the article, as they set a smaller resolution to make the screenshots more readable.

Managing workspaces on multiple screens

Switching workspaces and creating new ones on multiple screens is very similar to having just one screen. New workspaces get created on the screen that’s currently active — the one that has your mouse cursor on it.

So, to switch to a specific workspace (or to create a new one in case it doesn’t exist), press:

$mod + NUMBER

And you can switch workspaces on individual monitors independently!

Workspace 2 on the left screen, workspace 4 on the right screen.
Left screen switched to workspace 3, right screen still showing workspace 4.
Right screen switched to workspace 4, left screen still showing workspace 3.

Moving workspaces between monitors

The same way we can move windows to different workspaces by the following command:

$mod + shift + NUMBER

… we can move workspaces to different screens as well. However, there is no default shortcut for this action — so we have to create it first.

To create a custom shortcut, you’ll need to open the configuration file in a text editor of your choice (this article uses vim):

$ vim ~/.config/i3/config

And add the following lines to the very bottom of the configuration file:

# Moving workspaces between screens 
bindsym $mod+p move workspace to output right

Save and close the file. To reload and apply the configuration, press:

$mod + shift + r

Now you’ll be able to move your active workspace to the second monitor by:

$mod + p
Workspace 2 with Firefox on the left screen
Workspace 2 with Firefox moved to the second screen

And that’s it! Enjoy your new multi-monitor experience, and to learn more about i3, you’re welcome to read the previous article about i3 on the Fedora Magazine, or consult the official i3 documentation.

Sunday, 23 June

Saturday, 22 June

Friday, 21 June

02:00

Making Fedora 30 [Fedora Magazine]

What does it take to make a Linux distribution like Fedora 30? As you might expect, it’s not a simple process.

Changes in Fedora 30

Although Fedora 29 released on October 30, 2018, work on Fedora 30 began long before that. The first change proposal was submitted in late August. By my count, contributors made nine separate change proposals for Fedora 30 before Fedora 29 shipped.

Some of these proposals come early because they have a big impact, like mass removal of Python 2 packages. By the time the proposal deadline arrived in early January, the community had submitted 50 change proposals.

Of course, not all change proposals make it into the shipped release. Some of them are more focused on how we build the release instead of what we release. Others don’t get done in time. System-wide changes must have a contingency plan. These changes are generally evaluated at one of three points in the schedule: when packages branch from Rawhide, at the beginning of the Beta freeze, and at the beginning of the Final freeze. For Fedora 30, 45 change proposals were still active for the release.

Fedora has a calendar-based release schedule, but that doesn’t mean we ship whatever exists on a given date. We have a set of release criteria that we test against, and we don’t put out a release until all the blockers are resolved. This sometimes means a release is delayed, but it’s important that we ship reliable software.

For the Fedora 30 development cycle, we accepted 22 proposed blocker bugs and rejected 6. We also granted 33 freeze exceptions — bugs that can be fixed during the freeze because they impact the released artifacts or are otherwise important enough to include in the release.

Other contributions

Of course, there’s more to making a release than writing or packaging the code, testing it, and building the images. As with every release, the Fedora Design team created a new desktop background along with several supplemental wallpapers. The Fedora Marketing team wrote release announcements and put together talking points for the Ambassadors and Advocates to use when talking to the broader community.

If you’ve looked at our new website, that was the work of the Websites team in preparation for the Fedora 30 release.

The Documentation Team wrote Release Notes and updated other documentation. Translators provided translations to dozens of languages.

Many other people made contributions to the release of Fedora 30 in some way. It’s not easy to count everyone who has a hand in producing a Linux distribution, but we appreciate every one of our contributors. If you would like to join the Fedora Community but aren’t sure where to start, check out What Can I Do For Fedora?


Photo by Robin Sommer on Unsplash.

Thursday, 20 June

07:00

Introducing CIRCL: An Advanced Cryptographic Library [The Cloudflare Blog]


As part of Crypto Week 2019, today we are proud to release the source code of a cryptographic library we’ve been working on: a collection of cryptographic primitives written in Go, called CIRCL. This library includes a set of packages that target cryptographic algorithms for post-quantum (PQ), elliptic curve cryptography, and hash functions for prime groups. Our hope is that it’s useful for a broad audience. Get ready to discover how we made CIRCL unique.

Cryptography in Go

We use Go a lot at Cloudflare. It offers a good balance between ease of use and performance; the learning curve is very light, and after a short time, any programmer can get good at writing fast, lightweight backend services. And thanks to the possibility of implementing performance critical parts in Go assembly, we can try to ‘squeeze the machine’ and get every bit of performance.

Cloudflare’s cryptography team designs and maintains security-critical projects. It's not a secret that security is hard. That's why we are introducing the Cloudflare Interoperable Reusable Cryptographic Library - CIRCL. There are multiple goals behind CIRCL. First, we want to concentrate our efforts to implement cryptographic primitives in a single place. This makes it easier to ensure that proper engineering processes are followed. Second, Cloudflare is an active member of the Internet community - we are trying to improve and propose standards to help make the Internet a better place.

Cloudflare's mission is to help build a better Internet. For this reason, we want CIRCL to help the cryptographic community create proofs of concept, like the post-quantum TLS experiments we are doing. Over the years, lots of ideas have been put on the table by cryptographers (for example, homomorphic encryption, multi-party computation, and privacy-preserving constructions). Recently, we’ve seen those concepts picked up and exercised in a variety of contexts. CIRCL’s implementations of cryptographic primitives create a powerful toolbox for developers wishing to use them.

The Go language provides native packages for several well-known cryptographic algorithms, such as key agreement algorithms, hash functions, and digital signatures. There are also packages maintained by the community under golang.org/x/crypto that provide a diverse set of algorithms for supporting authenticated encryption, stream ciphers, key derivation functions, and bilinear pairings. CIRCL doesn’t try to compete with golang.org/x/crypto in any sense. Our goal is to provide a complementary set of implementations that are more aggressively optimized, or may be less commonly used but have a good chance at being very useful in the future.

Unboxing CIRCL

Our cryptography team worked on a fresh proposal to augment the capabilities of Go users with a new set of packages. You can get them by typing:

$ go get github.com/cloudflare/circl

The contents of CIRCL are split across several categories, summarized below:

  • Post-Quantum Cryptography - SIDH: isogeny-based cryptography providing key exchange mechanisms using ephemeral keys.
  • Post-Quantum Cryptography - SIKE: a key encapsulation mechanism (KEM) based on SIDH, for key agreement protocols.
  • Key Exchange - X25519, X448: the RFC-7748 key exchange mechanisms based on Montgomery elliptic curves, used in TLS 1.3 and Secure Shell.
  • Key Exchange - FourQ: one of the fastest elliptic curves at the 128-bit security level; experimental for key agreement and digital signatures.
  • Digital Signatures - Ed25519: the RFC-8032 digital signature algorithm based on twisted Edwards curves, used in digital certificates and authentication methods.
  • Hash to Elliptic Curve Groups - Elligator2, Ristretto, SWU, Icart: protocols based on elliptic curves require hash functions that map bit strings to points on an elliptic curve; useful in protocols such as Privacy Pass, OPAQUE, PAKE, and verifiable random functions.
  • Optimization - Curve P-384: optimizations that reduce the burden when moving from P-256 to P-384, for ECDSA and ECDH using Suite B at the top secret level.

SIKE, a Post-Quantum Key Encapsulation Mechanism

To better understand the post-quantum world, we started experimenting with post-quantum key exchange schemes and using them for key agreement in TLS 1.3. CIRCL contains the sidh package, an implementation of Supersingular Isogeny-based Diffie-Hellman (SIDH), as well as CCA2-secure Supersingular Isogeny-based Key Encapsulation (SIKE), which is based on SIDH.

CIRCL makes playing with PQ key agreement very easy. Below is an example of the SIKE interface that can be used to establish a shared secret between two parties for use in symmetric encryption. The example uses a key encapsulation mechanism (KEM). For our example in this scheme, Alice generates a random secret key, and then uses Bob’s pre-generated public key to encrypt (encapsulate) it. The resulting ciphertext is sent to Bob. Then, Bob uses his private key to decrypt (decapsulate) the ciphertext and retrieve the secret key. See more details about SIKE in this Cloudflare blog.

Let's see how to do this with CIRCL:

// Bob's key pair
prvB := NewPrivateKey(Fp503, KeyVariantSike)
pubB := NewPublicKey(Fp503, KeyVariantSike)

// Generate private key
prvB.Generate(rand.Reader)
// Generate public key
prvB.GeneratePublicKey(pubB)

var publicKeyBytes = make([]byte, pubB.Size())
var privateKeyBytes = make([]byte, prvB.Size())

pubB.Export(publicKeyBytes)
prvB.Export(privateKeyBytes)

// Encode public key to JSON
// Save privateKeyBytes on disk

Bob uploads the public key to a location accessible by anybody. When Alice wants to establish a shared secret with Bob, she performs encapsulation that results in two parts: a shared secret and the result of the encapsulation, the ciphertext.

// Read JSON to bytes

// Alice imports Bob's public key
pubB := NewPublicKey(Fp503, KeyVariantSike)
pubB.Import(publicKeyBytes)

// Prepare output buffers and encapsulate a shared secret for Bob.
// The sizing methods (CiphertextSize/SharedSecretSize) are assumed
// from the KEM's API.
kem := sike.NewSike503(rand.Reader)
ciphertext := make([]byte, kem.CiphertextSize())
sharedSecret := make([]byte, kem.SharedSecretSize())
kem.Encapsulate(ciphertext, sharedSecret, pubB)

// send ciphertext to Bob

Bob now receives ciphertext from Alice and decapsulates the shared secret:

// Bob decapsulates the ciphertext with his own key pair
kem := sike.NewSike503(rand.Reader)
sharedSecret := make([]byte, kem.SharedSecretSize())
kem.Decapsulate(sharedSecret, prvB, pubB, ciphertext)

At this point, both Alice and Bob can derive a symmetric encryption key from the secret generated.
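
For illustration, here is a minimal sketch of that derivation step using HKDF from golang.org/x/crypto; the salt and info labels are hypothetical placeholders, and in practice the input would be the sharedSecret buffer from the snippets above.

package main

import (
    "crypto/sha256"
    "fmt"
    "io"

    "golang.org/x/crypto/hkdf"
)

// deriveKey expands a KEM shared secret into a 256-bit symmetric key.
// The salt and info labels are hypothetical; bind them to your protocol.
func deriveKey(sharedSecret []byte) ([]byte, error) {
    kdf := hkdf.New(sha256.New, sharedSecret, []byte("example-salt"), []byte("example-protocol v1"))
    key := make([]byte, 32) // 32 bytes for, e.g., AES-256
    if _, err := io.ReadFull(kdf, key); err != nil {
        return nil, err
    }
    return key, nil
}

func main() {
    key, err := deriveKey([]byte("shared secret established via SIKE"))
    if err != nil {
        panic(err)
    }
    fmt.Printf("derived key: %x\n", key)
}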

The SIKE implementation includes:

  • Two different field sizes: Fp503 and Fp751. The choice of the field is a trade-off between performance and security.
  • Code optimized for AMD64 and ARM64 architectures, as well as generic Go code. For AMD64, we detect the micro-architecture and, if it's recent enough (e.g., it supports the ADOX/ADCX and BMI2 instruction sets), we use different multiplication techniques to make execution even faster (see the detection sketch after this list).
  • Code implemented in constant time, that is, the execution time doesn’t depend on secret values.
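
As a rough illustration of that dispatch (not CIRCL's actual internals), the golang.org/x/sys/cpu package exposes the relevant feature flags:

package main

import (
    "fmt"

    "golang.org/x/sys/cpu"
)

func main() {
    // Pick the faster carry-chain multiplication kernel only when the
    // micro-architecture supports the ADX (ADOX/ADCX) and BMI2 extensions.
    if cpu.X86.HasADX && cpu.X86.HasBMI2 {
        fmt.Println("fast path: MULX/ADOX/ADCX multiplication")
    } else {
        fmt.Println("fallback: generic Go multiplication")
    }
}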

We also kept the heap-memory footprint low, so that the implementation uses a minimal amount of dynamically allocated memory. In the future, we plan to provide multiple implementations of post-quantum schemes. Currently, our focus is on algorithms useful for key exchange in TLS.

SIDH/SIKE are interesting because the key sizes produced by those algorithms are relatively small (compared with other PQ schemes). Nevertheless, performance is not all that great yet, so we'll continue looking. We plan to add lattice-based algorithms, such as NTRU-HRSS and Kyber, to CIRCL, as well as another, more experimental algorithm called cSIDH, which we would like to try in other applications. CIRCL doesn't currently contain any post-quantum signature algorithms; adding one is also on our to-do list. After our experiment with TLS key exchange completes, we're going to look at post-quantum PKI. But that's a topic for a future blog post, so stay tuned.

Finally, our code is largely based on the implementation from the NIST submission, along with the work of former intern Henry de Valence, and we would like to thank both Henry and the SIKE team for their great work.

Elliptic Curve Cryptography

Elliptic curve cryptography brings short key sizes and faster evaluation of operations when compared to algorithms based on RSA. Elliptic curves were standardized during the early 2000s and have since gained popularity as a more efficient way of securing communications.

Elliptic curves are used in almost every project at Cloudflare, not only for establishing TLS connections, but also for certificate validation, certificate revocation (OCSP), Privacy Pass, certificate transparency, and AMP Real URL.

The Go language provides native support for NIST-standardized curves, the most popular of which is P-256. In a previous post, Vlad Krasnov described the relevance of optimizing several cryptographic algorithms, including the P-256 curve. When working at Cloudflare scale, small performance issues are significantly magnified. This is one reason why Cloudflare pushes the boundaries of efficiency.

A similar thing happened with the chained validation of certificates. For some certificates, we observed performance issues when validating the chain: signatures over the P-384 curve, the curve that corresponds to the 192-bit security level, were taking up 99% of CPU time! It is common for certificates closer to the root of the chain of trust to rely on stronger security assumptions, for example, by using larger elliptic curves. Our first-aid response came in the form of an optimized implementation, written by Brendan McMillion, that reduced the time of performing elliptic curve operations by a factor of 10. The code for P-384 is also available in CIRCL.
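
To make the hot spot concrete, here is a small standard-library sketch of P-384 ECDSA signing and verification; verification is the operation repeated for every certificate in a chain, and it is exactly the kind of curve arithmetic the optimized implementation speeds up.

package main

import (
    "crypto/ecdsa"
    "crypto/elliptic"
    "crypto/rand"
    "crypto/sha512"
    "fmt"
)

func main() {
    // A CA-style P-384 key; SHA-384 pairs naturally with P-384 at the
    // 192-bit security level.
    priv, err := ecdsa.GenerateKey(elliptic.P384(), rand.Reader)
    if err != nil {
        panic(err)
    }
    digest := sha512.Sum384([]byte("to-be-signed certificate data"))

    r, s, err := ecdsa.Sign(rand.Reader, priv, digest[:])
    if err != nil {
        panic(err)
    }

    // This verification is the expensive step observed during chain validation.
    fmt.Println("valid:", ecdsa.Verify(&priv.PublicKey, digest[:], r, s))
}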

The latest developments in elliptic curve cryptography have caused a shift to use elliptic curve models with faster arithmetic operations. The best example is undoubtedly Curve25519; other examples are the Goldilocks and FourQ curves. CIRCL supports all of these curves, allowing instantiation of Diffie-Hellman exchanges and Edwards digital signatures. Although it slightly overlaps the Go native libraries, CIRCL has architecture-dependent optimizations.

Hashing to Groups

Many cryptographic protocols rely on the hardness of solving the Discrete Logarithm Problem (DLP) in special groups, one of which is the integers modulo a large integer. To guarantee that the DLP is hard to solve, the modulus must be a large prime number. Increasing its size boosts security, but also makes operations more expensive. A better approach is to use elliptic curve groups, since they provide faster operations.

In some cryptographic protocols, it is common to use a function with the properties of a cryptographic hash function that maps bit strings into elements of the group. This is easy to accomplish when, for example, the group is the set of integers modulo a large prime. However, it is not so clear how to perform this mapping on elliptic curves. In the cryptographic literature, several methods have been proposed, using the terms hashing to curves and hashing to a point more or less interchangeably.

The main issue is that there is no general method for deterministically finding points on an arbitrary elliptic curve; the closest available methods target special curves and parameters. This is a problem for implementers of cryptographic algorithms, who have a hard time figuring out a suitable method for hashing to points of an elliptic curve. Compounding that, the chances of doing this wrong are high. There are many different methods, elliptic curves, and security considerations to analyze. For example, a vulnerability in the WPA3 handshake protocol exploited a non-constant-time hashing method, resulting in the recovery of keys. Currently, an IETF draft tracks work in progress that provides hashing methods, unifying the requirements with curves and their parameters.
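
To see the pitfall concretely, below is a sketch of the classic "try-and-increment" method over P-256 using only the standard library. The number of loop iterations depends on the input, so the method is not constant time; this is precisely the class of weakness exploited in the WPA3 case, and what the draft's methods are designed to avoid.

package main

import (
    "crypto/elliptic"
    "crypto/sha256"
    "fmt"
    "math/big"
)

// hashToPoint maps msg to a point on P-256 by try-and-increment.
// WARNING: the loop count depends on the input, so this is NOT constant time.
func hashToPoint(msg []byte) (x, y *big.Int) {
    params := elliptic.P256().Params()
    three := big.NewInt(3)
    for ctr := byte(0); ; ctr++ {
        h := sha256.Sum256(append(msg, ctr))
        x = new(big.Int).SetBytes(h[:])
        x.Mod(x, params.P)

        // Candidate y^2 = x^3 - 3x + b (mod p), the NIST curve equation.
        y2 := new(big.Int).Exp(x, three, params.P)
        y2.Sub(y2, new(big.Int).Mul(three, x))
        y2.Add(y2, params.B)
        y2.Mod(y2, params.P)

        // ModSqrt returns nil when y2 is not a square; then try the next ctr.
        if y = new(big.Int).ModSqrt(y2, params.P); y != nil {
            return x, y
        }
    }
}

func main() {
    x, y := hashToPoint([]byte("hello"))
    fmt.Println("on curve:", elliptic.P256().IsOnCurve(x, y))
}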

To address this problem, CIRCL will include implementations of hashing methods for elliptic curves. Our development follows the evolution of the IETF draft, so users of CIRCL get ready-to-use functionality that covers the needs of several cryptographic protocols.

Update on Bilinear Pairings

Bilinear pairings are sometimes regarded as a tool for cryptanalysis; however, pairings can also be used constructively, enabling the instantiation of advanced public-key algorithms such as identity-based encryption, attribute-based encryption, blind digital signatures, and three-party key agreement.

An efficient way to instantiate a bilinear pairing is to use elliptic curves. Only a special class of curves can be used: these so-called pairing-friendly curves have specific properties that enable the efficient evaluation of a pairing.

Some families of pairing-friendly curves were introduced by Barreto-Naehrig (BN), Kachisa-Schaefer-Scott (KSS), and Barreto-Lynn-Scott (BLS). BN256 is a BN curve using a 256-bit prime and is one of the fastest options for implementing a bilinear pairing; it is available in the package golang.org/x/crypto/bn256. In fact, the BN256 curve is used by Cloudflare's Geo Key Manager, which allows distributing encrypted keys around the world. At Cloudflare, high performance is a must, and with this motivation we released, in 2017, an optimized implementation of the bn256 package that is 8x faster than the original. These optimizations were later adopted by several other projects, such as the Ethereum protocol and the Randomness Beacon project.
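
As a quick demonstration of the bilinearity property these applications build on, the golang.org/x/crypto/bn256 package can verify e(aP, bQ) = e(P, Q)^(ab) directly:

package main

import (
    "crypto/rand"
    "fmt"
    "math/big"

    "golang.org/x/crypto/bn256"
)

func main() {
    // Random scalars with their corresponding points aP in G1 and bQ in G2.
    a, aP, _ := bn256.RandomG1(rand.Reader)
    b, bQ, _ := bn256.RandomG2(rand.Reader)

    // Left side: e(aP, bQ).
    left := bn256.Pair(aP, bQ)

    // Right side: e(P, Q)^(ab), computed from the group generators by
    // multiplying the scalars modulo the group order.
    k := new(big.Int).Mul(a, b)
    k.Mod(k, bn256.Order)
    P := new(bn256.G1).ScalarBaseMult(big.NewInt(1))
    Q := new(bn256.G2).ScalarBaseMult(big.NewInt(1))
    right := new(bn256.GT).ScalarMult(bn256.Pair(P, Q), k)

    // Bilinearity means both sides agree.
    fmt.Println("bilinear:", left.String() == right.String())
}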

Recent improvements in solving the DLP over extension fields GF(pᵐ), for p prime and m > 1, have impacted the security of pairings, prompting a recalculation of the parameters used for pairing-friendly curves.

Before these discoveries, the BN256 curve provided a 128-bit security level, but larger primes are now needed to target the same level. That does not mean the BN256 curve has been broken: BN256 still gives about 100 bits of security, that is, approximately 2¹⁰⁰ operations would be required to cause real danger, which remains infeasible with current computing power.

Alongside the CIRCL announcement, we want to share our research and development plans for obtaining efficient curve(s) that can become a stronger successor to BN256. According to the estimates of Barbulescu-Duquesne, a BN curve must use primes of at least 456 bits to match a 128-bit security level. The recalculation of parameters also brings BLS and KSS curves back into the picture as efficient alternatives. To this end, a standardization effort at the IETF is in progress with the aim of defining parameters and pairing-friendly curves that match different security levels.

Note that regardless of the curve(s) chosen, there is an unavoidable performance downgrade when moving from BN256 to a stronger curve. Actual timings were presented by Aranha, who described the evolution of the race for high-performance pairing implementations. The purpose of our continuous development of CIRCL is to minimize this impact through fast implementations.

Optimizations

Go is easy to learn and use for systems programming, and yet it makes it possible to use assembly so that you can stay close “to the metal”. We have blogged about improving performance in Go a few times in the past (see these posts about encryption, ciphersuites, and image encoding).

When developing CIRCL, we crafted the code to get the best possible performance out of the machine. We leverage the capabilities of the architecture and architecture-specific instructions. This means that in some cases we need to get our hands dirty and rewrite parts of the software in Go assembly, which is not easy, but definitely worth the effort when it comes to performance. We focused on x86-64, as this is our main target, but we also think it's worth looking at the ARM architecture, and in some cases (like SIDH or P-384) CIRCL has optimized code for this platform.

We also try to ensure that the code uses memory efficiently, crafting it so that fast allocations on the stack are preferred over expensive heap allocations. Where heap allocation is unavoidable, we designed the APIs to allow pre-allocating memory ahead of time and reusing it across multiple operations (see the sketch below).
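
Using the SIKE interface from the example above, that design amortizes allocation across many operations; as before, the buffer-sizing methods are assumptions about the KEM's API rather than a definitive reference:

// Allocate once...
kem := sike.NewSike503(rand.Reader)
ciphertext := make([]byte, kem.CiphertextSize())     // size assumed from the KEM API
sharedSecret := make([]byte, kem.SharedSecretSize()) // size assumed from the KEM API

// ...then reuse the same buffers for many encapsulations, with no
// per-operation heap allocation.
for i := 0; i < 1000; i++ {
    kem.Encapsulate(ciphertext, sharedSecret, pubB)
    // use sharedSecret here, e.g., derive a session key
}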

Security

The CIRCL library is offered as-is and without guarantee, so expect changes to the code, repository, and API in the future. We recommend caution before using this library in a production application, since part of its content is experimental.

As new attacks and vulnerabilities arise over time, the security of software should be treated as a continuous process. In particular, the assessment of cryptographic software is critical: it requires expertise from several fields, not only computer science. Cryptography engineers must be aware of the latest vulnerabilities and methods of attack in order to defend against them.

The development of CIRCL follows best practices for secure development. For example, if the execution time of the code depends on secret data, an attacker could leverage those irregularities to recover secret keys. In our code, we take care to write constant-time code and hence prevent timing-based attacks.
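
A small standard-library illustration of the principle (not CIRCL internals): comparing a secret tag with bytes.Equal can exit at the first mismatch and leak how many leading bytes matched, whereas crypto/subtle inspects every byte regardless:

package main

import (
    "crypto/subtle"
    "fmt"
)

func main() {
    expected := []byte("secret-tag-0123456789abcdef")
    received := []byte("secret-tag-0123456789abcdeX")

    // ConstantTimeCompare runs in time independent of where the first
    // mismatch occurs, so timing reveals nothing about the secret.
    if subtle.ConstantTimeCompare(expected, received) == 1 {
        fmt.Println("tags match")
    } else {
        fmt.Println("tags differ")
    }
}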

Developers of cryptographic software must also be aware of optimizations performed by the compiler and/or the processor, since these optimizations can lead to insecure binaries in some cases. All of these issues could be exploited in real attacks aimed at compromising systems and keys. Therefore, software changes must be tracked through thorough code reviews. Static analyzers and automated testing tools also play an important role in the security of the software.

Summary

CIRCL is envisioned as an effective tool for experimenting with modern cryptographic algorithms while providing high-performance implementations. Today marks the starting point of continuous innovation and of giving back to the community in the form of a cryptographic library. There are still several other applications, such as homomorphic encryption, multi-party computation, and privacy-preserving protocols, that we would like to explore.

We are a team of cryptography, security, and software engineers working to improve and augment Cloudflare products. Our team keeps its communication channels open for receiving comments and merging contributions. We welcome opinions and contributions! If you would like to get in contact, check out our GitHub repository for CIRCL: github.com/cloudflare/circl. We want to share our work and hope it makes someone else's job easier as well.

Finally, special thanks to all the contributors who have directly or indirectly helped to implement the library: Ko Stoffelen, Brendan McMillion, Henry de Valence, Michael McLoughlin, and all the people who invested their time in reviewing our code.

00:50

Critical Firefox vulnerability fixed in 67.0.3 [Fedora Magazine]

On Tuesday, Mozilla issued a security advisory for Firefox, the default web browser in Fedora. The advisory concerns a CVE for a type confusion vulnerability that can occur when JavaScript objects are manipulated, and which can be used to crash your browser. There are apparently already attacks in the wild that exploit the issue. Read on for more information, and how to protect your system against this flaw.

At the same time the security vulnerability was issued, Mozilla also released Firefox 67.0.3 (and ESR 60.7.1) to fix the issue.

Updating Firefox in Fedora

Firefox 67.0.3 (with the security fixes) has already been pushed to the stable Fedora repositories. The security fix will be applied to your system with your next update. You can also update the firefox package only by running the following command:

$ sudo dnf update --refresh firefox

This command requires you to have sudo set up. Note that not every Fedora mirror syncs at the same rate. Community sites graciously donate space and bandwidth to these mirrors to carry Fedora content. You may need to try again later if your selected mirror is still awaiting the latest update.

Wednesday, 19 June

07:01

Cloudflare's Ethereum Gateway [The Cloudflare Blog]

Today, as part of Crypto Week 2019, we are excited to announce Cloudflare's Ethereum Gateway, where you can interact with the Ethereum network without installing any additional software on your computer.

This is another tool in Cloudflare’s Distributed Web Gateway tool set. Currently, Cloudflare lets you host content on the InterPlanetary File System (IPFS) and access it through your own custom domain. Similarly, the new Ethereum Gateway allows access to the Ethereum network, which you can provision through your custom hostname.

This setup makes it possible to add interactive elements to sites powered by Ethereum smart contracts, a decentralized computing platform. And, in conjunction with the IPFS gateway, it allows hosting websites and resources in a decentralized manner, with the added bonus of the speed, security, and reliability provided by the Cloudflare edge network. You can access our Ethereum gateway directly at https://cloudflare-eth.com.

This brief primer on how Ethereum and smart contracts work has examples of the many possibilities of using the Cloudflare Distributed Web Gateway.

Primer on Ethereum

You may have heard of Ethereum as a cryptocurrency. What you may not know is that Ethereum is so much more. Ethereum is a distributed virtual computing network that stores and enforces smart contracts.

So, what is a smart contract?

Good question. An Ethereum smart contract is simply a piece of code stored on the Ethereum blockchain. When the contract is triggered, it runs on the Ethereum Virtual Machine (EVM). The EVM is a distributed virtual machine that runs smart contract code and produces cryptographically verified changes to the state of the Ethereum blockchain as its result.

To illustrate the power of smart contracts, let's consider a little example.

Anna wants to start a VPN provider but she lacks the capital. To raise funds for her venture, she decides to hold an Initial Coin Offering (ICO). Rather than design an ICO contract from scratch, Anna bases her contract on ERC-20, a template for issuing fungible tokens that is perfect for ICOs. Anna sends her ERC-20-compliant contract to the Ethereum network and starts to sell stock in her new company, VPN Co.

Once she's sorted out funds, Anna sits down and writes a smart contract. Anna's contract asks customers to send her their public key, along with some Ether (the currency of the Ethereum network). She then authorizes the public key to access her VPN service. All without having to hold any secret information. Huzzah!

Next, rather than set up the infrastructure to run a VPN herself, Anna decides to use the blockchain again, but this time as a customer. Cloud Co. sells managed cloud infrastructure using their own smart contract. Anna programs her contract to send the appropriate amount of Ether to Cloud Co.'s contract. Cloud Co. then provisions the servers she needs to host her VPN. By automatically purchasing more infrastructure every time she has a new customer, her VPN company can scale totally autonomously.

Finally, Anna pays dividends to her investors out of the profits, keeping a little for herself.

And there you have it.

A decentralised, autonomous, smart VPN provider.

A smart contract stored on the blockchain has an associated account for storing funds, and the contract is triggered when someone sends Ether to that account. So for our VPN example, the provisioning contract triggers when someone transfers money into the account associated with Anna’s contract.

What distinguishes smart contracts from ordinary code?

The "smart" part of a smart contract is they run autonomously. The "contract" part is the guarantee that the code runs as written.

Because this contract is enforced cryptographically, maintained in the tamper-resistant medium of the blockchain, and verified by the consensus of the network, these contracts are more reliable than regular contracts, which can provoke disputes.

Ethereum Smart Contracts vs. Traditional Contracts

A regular contract is enforced by the court system, litigated by lawyers. The outcome is uncertain; different courts rule differently and hiring more or better lawyers can swing the odds in your favor.

Smart contract outcomes are predetermined and are nearly incorruptible. However, here be dragons: though the outcome can be predetermined and incorruptible, a poorly written contract might not have the intended behavior, and because contracts are immutable, this is difficult to fix.

How are smart contracts written?

You can write smart contracts in a number of languages, some of which are Turing complete, e.g. Solidity. A Turing complete language lets you write code that can evaluate any computable function. This puts Solidity in the same class of languages as Python and Java. The compiled bytecode is then run on the EVM.

The EVM differs from a standard VM in a number of ways:

The EVM is distributed

Each piece of code is run by numerous nodes. Nodes verify the computation before accepting a block, and therefore ensure that miners who want their blocks accepted must always run the EVM honestly. A block is only considered accepted when more than half of the network accepts it. This is the consensus part of Ethereum.

The EVM is entirely deterministic

This means that the same inputs to a function always produce the same outputs. Because regular VMs have access to file storage and the network, the results of a function call can be non-deterministic. Every EVM has the same start state, thus a given set of inputs always gives the same outputs. This makes the EVM more reliable than a standard VM.

There are two big gotchas that come with this determinism:

  • EVM bytecode is Turing complete and therefore discerning the outputs without running the computation is not always possible.
  • Ethereum smart contracts can store state on the blockchain. This means that the output of a function can vary as the blockchain changes. Although this is technically deterministic, in that the blockchain is an input to the function, it may still be impossible to derive the output in advance.

Smart contracts therefore suffer from the same problem as any piece of software: bugs. However, unlike normal code, where the authors can issue a patch, code stored on the blockchain is immutable. More problematically, even if the author provides a new smart contract, the old one always remains available on the blockchain.

This means that when writing contracts, authors must be especially careful to write secure code, and to include a kill switch so that if bugs do reside in the code, they can be squashed. If there is no kill switch and a smart contract has exploitable vulnerabilities, they can potentially lead to the theft of resources from the smart contract or from other individuals. EVM bytecode includes a special SELFDESTRUCT opcode for just this purpose: it deletes a contract and sends all its funds to a specified address.

The need to include a kill switch was brought into sharp focus during the infamous DAO incident. The DAO smart contract acted as a complex decentralized venture capital (VC) fund, holding Ether collected from a group of investors that was worth $250 million at its peak. Hackers exploited vulnerabilities in the smart contract and stole Ether worth $50 million.

Because there is no way to undo transactions in Ethereum, there was a highly controversial “hard fork,” where the majority of the community agreed to accept a block with an “irregular state change” that essentially drained all DAO funds into a special “WithdrawDAO” recovery contract. By convincing enough miners to accept this irregular block as valid, the DAO could return funds.

Not everyone agreed with the change. Those who disagreed rejected the irregular block and formed the Ethereum Classic network, with both branches of the fork growing independently.

Kill switches, however, can cause their own problems. For example, when a contract used as a library flips its kill switch, all contracts relying on this contract can no longer operate as intended, even though the underlying library code is immutable. This caused over 500,000 ETH to become stuck in multi-signature wallets when an attacker triggered the kill switch of an underlying library.

Users of the multi-signature library assumed the immutability of the code meant that the library would always operate as anticipated. But the smart contracts that interact with the blockchain are only deterministic when accounting for the state of the blockchain.

In the wake of the DAO, various tools were created that check smart contracts for bugs or enable bug bounties, for example Securify and The Hydra.

Another way smart contracts avoid bugs is by using standardized patterns. For example, ERC-20 defines a standardized interface for producing tokens such as those used in ICOs, and ERC-721 defines a standardized interface for implementing non-fungible tokens. Non-fungible tokens can be used in trading-card-style games like CryptoKitties, a game built on the Ethereum blockchain in which players can buy, sell, and breed cats, with each cat being unique.

CryptoKitties is built on a collection of smart contracts that provides an open-source Application Binary Interface (ABI) for interacting with the KittyVerse -- the virtual world of the CryptoKitties application. An ABI simply allows you to call functions in a contract and receive any returned data. The KittyBase code may look like this:

contract KittyBase is KittyAccessControl {
    event Birth(address owner, uint256 kittyId, uint256 matronId, uint256 sireId, uint256 genes);
    event Transfer(address from, address to, uint256 tokenId);

    struct Kitty {
        uint256 genes;
        uint64 birthTime;
        uint64 cooldownEndBlock;
        uint32 matronId;
        uint32 sireId;
        uint32 siringWithId;
        uint16 cooldownIndex;
        uint16 generation;
    }

    [...]

    function _transfer(address _from, address _to, uint256 _tokenId) internal {
        ...
    }

    function _createKitty(uint256 _matronId, uint256 _sireId, uint256 _generation, uint256 _genes, address _owner) internal returns (uint) {
        ...
    }

    [...]
}

Besides defining what a Kitty is, this contract defines two basic functions for transferring and creating kitties. Both are internal and can only be called from contracts that inherit from KittyBase. The KittyOwnership contract inherits from both ERC-721 and KittyBase, and implements an external transfer function that calls the internal _transfer function. This code is compiled into bytecode and written to the blockchain.

By implementing a standardised interface like ERC-721, smart contracts that aren't specifically aware of CryptoKitties can still interact with the KittyVerse. The CryptoKitties ABI functions allow users to create distributed apps (dApps) of their own design on top of the KittyVerse, and allow other users to use those dApps. This extensibility helps demonstrate the potential of smart contracts.

How is this so different?

Smart contracts are, by definition, public. Everyone can see the terms and understand where the money goes. This is a radically different approach to providing transparency and accountability. Because all contracts and transactions are public and verified by consensus, trust is distributed between the people, rather than centralized in a few big institutions.

The trust given to institutions is historic in that we trust them because they have previously demonstrated trustworthiness.

The trust placed in consensus-based algorithms is based on the assumption that most people are honest, or more accurately, that no sufficiently large subset of people can collude to produce a malicious outcome. This is the democratisation of trust.

In the case of the DAO attack, a majority of nodes agreed to accept an “irregular” state transition. This effectively undid the damage of the attack and demonstrates how, at least in the world of blockchain, perception is reality. Because most people “believed” (accepted) this irregular block, it became a “real,” valid block. Most people think of the blockchain as immutable and trust the power of consensus to ensure correctness; however, if enough people agree to do something irregular, they don't have to play by the rules.

So where does Cloudflare fit in?

Accessing the Ethereum network and its attendant benefits directly requires running complex software, including downloading and cryptographically verifying hundreds of gigabytes of data. Apart from producing technical barriers to entry for users, this can also exclude people with low-power devices.

To help those users and devices access the Ethereum network, the Cloudflare Ethereum gateway allows any device capable of accessing the web to interact with the Ethereum network in a safe, reliable way.

Through our gateway, not only can you explore the blockchain, but if you give our gateway a signed transaction, we’ll push it to the network to allow miners to add it to their blockchain. This means that you can send Ether and even put new contracts on the blockchain without having to run a node.

"But Jonathan," I hear you say, "by providing a gateway aren't you just making Cloudflare a centralizing institution?"

That's a fair question. Thankfully, Cloudflare won't be alone in offering these gateways. We're joining organizations such as Infura in expanding the constellation of gateways that already exist. We hope that, by providing a fast, reliable service, we can enable people who have never previously used smart contracts to do so, and in so doing bring the benefits they offer to billions of regular Internet users.

"We're excited that Cloudflare is bringing their infrastructure expertise to the Ethereum ecosystem. Infura has always believed in the importance of standardized, open APIs and compatibility between gateway providers, so we look forward to collaborating with their team to build a better distributed web." - E.G. Galano, Infura co-founder.

By providing a gateway to the Ethereum network, we help users make the jump from general web-user to cryptocurrency native, and eventually make the distributed web a fundamental part of the Internet.

What can you do with Cloudflare's Gateway?

Visit cloudflare-eth.com to interact with our example app. But to really explore the Ethereum world, access the RPC API, where you can do anything that can be done on the Ethereum network itself, from examining contracts to transferring funds.

Our gateway accepts POST requests containing JSON. For a complete list of calls, visit the Ethereum GitHub page. To get the number of the most recent block, you could run:

curl https://cloudflare-eth.com -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'

and you would get a response something like this:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": "0x780f17"
}

We also invite developers to build dApps based on our Ethereum gateway using our API, which allows developers to build websites powered by the Ethereum blockchain. Check out the developer docs to get started. If you want to read more about how Ethereum works, check out this deep dive.
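
As a sketch of what such a dApp backend might do, here is the same eth_blockNumber query issued from Go; the anonymous struct simply mirrors the JSON-RPC response shown above:

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "strconv"
    "strings"
)

func main() {
    reqBody := []byte(`{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}`)

    resp, err := http.Post("https://cloudflare-eth.com", "application/json", bytes.NewReader(reqBody))
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // The result field carries the block number as a hex-encoded quantity,
    // e.g. "0x780f17".
    var rpc struct {
        Result string `json:"result"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&rpc); err != nil {
        panic(err)
    }

    n, err := strconv.ParseUint(strings.TrimPrefix(rpc.Result, "0x"), 16, 64)
    if err != nil {
        panic(err)
    }
    fmt.Println("latest block:", n)
}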

The architecture

Cloudflare is uniquely positioned to host an Ethereum gateway, and we have the utmost faith in the products we offer to customers. This is why the Cloudflare Ethereum gateway runs as a Cloudflare customer and we dogfood our own products to provide a fast and reliable gateway. The domain we run the gateway on (https://cloudflare-eth.com) uses Cloudflare Workers to cache responses for popular queries made to the gateway. Responses for these queries are answered directly from the Cloudflare edge, which can result in a ~6x speed-up.

We also use Load Balancing and Argo Tunnel for fast, redundant, and secure content delivery. With Argo Smart Routing enabled, requests and responses to our Ethereum gateway are tunnelled directly from our Ethereum node to the Cloudflare edge using the best possible routing.

Similar to our IPFS gateway, cloudflare-eth.com is an SSL for SaaS provider. This means that anyone can set up the Cloudflare Ethereum gateway as a backend for access to the Ethereum network through their own registered domains. For more details on how to set up your own domain with this functionality, see the Ethereum tab on cloudflare.com/distributed-web-gateway.

With these features, you can use Cloudflare’s Distributed Web Gateway to create a fully decentralized website with an interactive backend that allows interaction with the IPFS and Ethereum networks. For example, you can host your content on IPFS (using something like Pinata to pin the files), and then host the website backend as a smart contract on Ethereum. This architecture does not require a centralized server for hosting files or the actual website. Added to the power, speed, and security provided by Cloudflare’s edge network, your website is delivered to users around the world with unparalleled efficiency.

Embracing a distributed future

At Cloudflare, we support technologies that help distribute trust. By providing a gateway to the Ethereum network, we hope to facilitate the growth of a decentralized future.

We thank the Ethereum Foundation for their support of a new gateway in expanding the distributed web:

“Cloudflare's Ethereum Gateway increases the options for thin-client applications as well as decentralization of the Ethereum ecosystem, and I can't think of a better person to do this work than Cloudflare. Allowing access through a user's custom hostname is a particularly nice touch. Bravo.” - Dr. Virgil Griffith, Head of Special Projects, Ethereum Foundation.

We hope that by allowing anyone to use the gateway as the backend for their domain, we make the Ethereum network more accessible to everyone, with the added speed and security that comes from serving this content directly from Cloudflare's global edge network.

So, go forth and build our vision – the distributed crypto-future!

02:00

Get the latest Ansible 2.8 in Fedora [Fedora Magazine]

Ansible is one of the most popular automation engines in the world. It lets you automate virtually anything, from setup of a local system to huge groups of platforms and apps. It’s cross platform, so you can use it with all sorts of operating systems. Read on for more information on how to get the latest Ansible in Fedora, some of its changes and improvements, and how to put it to use.

Releases and features

Ansible 2.8 was recently released with many fixes, features, and enhancements. It was available in Fedora mere days afterward as an official update in Fedora 29 and 30, as well as in EPEL. The follow-on version 2.8.1 was released two weeks ago, and again the new release was available within a few days in Fedora.

Installation is, of course, easy to do from the official Fedora repositories using sudo:

$ sudo dnf -y install ansible

The 2.8 release has a long list of changes, and you can read them in the Porting Guide for 2.8. But they include some goodies, such as Python interpreter discovery. Ansible 2.8 now tries to figure out which Python is preferred by the platform it runs on, and in cases where that fails, Ansible uses a fallback list. However, you can still set the variable ansible_python_interpreter to choose the Python interpreter, as in the example below.
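
For example, pinning the interpreter for a single host in an inventory file looks like this (the group and host names are hypothetical):

[webservers]
web1.example.com ansible_python_interpreter=/usr/bin/python3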

Another change makes Ansible more consistent across platforms. Since sudo is largely exclusive to UNIX/Linux, and other platforms don't have it, become is now used in more places, including in command line switches. For example, --ask-sudo-pass has become --ask-become-pass, and the prompt is now BECOME password: instead (see the example below).
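
So a privilege-escalating run that previously used --ask-sudo-pass now looks something like this (the playbook name is hypothetical):

$ ansible-playbook site.yml --ask-become-pass
BECOME password: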

There are many more features in the 2.8 and 2.8.1 releases. Do check out the official changelog on GitHub for all the details.

Using Ansible

Maybe you're not sure if Ansible is something you could really use. Don't worry, you're not alone in thinking that, because it's so powerful. But it turns out that it's not hard to use it even for simple or individual setups, like a home with a couple of computers (or even just one!).

We covered this topic earlier in Fedora Magazine as well:

Give Ansible a try and see what you think. The great part about it is that Fedora stays quite up to date with the latest releases. Happy automating!