Monday, 09 December

07:00

Metasploit for drones? Best of luck with that, muses veteran tinkerer [The Register]

Been down this path and it ain't that easy, says man who knows

Black Hat Europe  A veteran drone hacker reckons the recent release of the Dronesploit framework won't go down quite as its inventors hope.…

06:20

Behuld – zee-a internet ouff tuilet tissuoe at Meecrusufft Sveden. Børk børk børk! [The Register]

Windows giant shows off smart-metered, connected office in Stockholm

The Register took a trip to Microsoft's shiny new Stockholm HQ to check out what the company's employees have to look forward to over the next decade - and came away more informed about smart-metered loo roll.…

05:40

Two can play that game: China orders ban on US computers and software [The Register]

Who needs who more?

China has ordered all government offices to start ripping out non-Chinese computers and software in order to bolster domestic manufacturers and suppliers.…

05:38

WireGuard Lands In Net-Next While It Waits For Inclusion In Linux 5.6 [Phoronix]

The WireGuard secure VPN tunnel kernel code has landed in net-next! This means that -- barring any major issues coming to light that would lead to a revert -- WireGuard will finally reach the mainline kernel with the Linux 5.6 cycle kicking off in late January or early February...

05:32

ADriConf GUI Control Panel Support For Mesa Vulkan Drivers Is Brought Up [Phoronix]

One of the most frequent complaints we hear from Linux gamers running open-source GPU drivers is over the lack of the hardware vendors supporting any feature-rich control panels like they do on Windows. There are many Linux driver tunables exposed by these open-source graphics drivers, but often they can only be manipulated via command-line options, environment variables, boot parameters, and other less-than-straightforward means, especially for recent converts from Windows and other novice Linux users. ADriConf has been doing a fairly decent job as a third-party means of helping to improve the situation and now there is talk of it supporting Vulkan driver settings...
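For readers unfamiliar with what those tunables look like in practice, here is a small illustrative sketch (not ADriConf itself) of the manual approach it aims to replace: setting a couple of real Mesa environment variables before launching a game. The binary path is a placeholder.

    # Illustrative only: the manual, environment-variable way of tweaking
    # Mesa driver behavior that a GUI like ADriConf aims to replace.
    import os
    import subprocess

    env = os.environ.copy()
    env["mesa_glthread"] = "true"   # enable Mesa's threaded OpenGL dispatch
    env["vblank_mode"] = "0"        # disable vsync (useful for benchmarking)

    # Placeholder path -- substitute the actual game or benchmark binary.
    subprocess.run(["/path/to/game-binary"], env=env)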

05:21

Facebook's New Linux Slab Memory Controller Saving 30~40%+ Of Memory, Less Fragmentation [Phoronix]

Back in September we wrote about Facebook's Roman Gushchin working on a new slab memory controller/allocator implementation that in turn could provide better memory utilization and less slab memory usage. This wasn't ready in time for the 5.5 kernel but a revised patch series was sent out last week...

05:10

Fedora Prepares To Roll-Out Linux 5.4 Kernel Update But Needs Help Testing [Phoronix]

Fedora users eager to see the Linux 5.4 stable kernel can engage by helping to test their newly-spun 5.4-based kernel image prior to it officially landing as a stable release update...

05:00

David Phillips, godfather of UK tech distribution industry, dies aged 74 [The Register]

Northamber founder passes after 'short illness'

Obit  The Brit tech industry has lost one of its founding fathers, a literal and metaphorical giant who pioneered hardware and software distribution. David Michael Phillips, Northamber's founder and – until recently – its long-serving chairman, has died.…

04:34

How Pranksters Tricked Twitter-Scraping Sites Into Copyright Infringement [Slashdot]

An anonymous reader shares a remarkable story from Fortune's Data Sheet newsletter: The story begins on Dec. 3, when an artist going by @Hannahdouken on Twitter posted an image of hand-drawn text reading, "This site sells STOLEN Artwork, do NOT buy from them!" and asked followers to reply that they wanted the image on a shirt. They were testing a theory. For years, artists posting their work online have found the art turned into t-shirts and other merch without permission or compensation. The theory was that this was being done by automated bots that combed Twitter for images with such enthusiastic replies, and then automatically created merch on sites such as Gearbubble, copthistee, and Teeshirtpublic... Sure enough, automated bots picked up @Hannahdouken's image and placed it on t-shirts... They report that other Twitter users then took the stunt even further, including one who "had a theory: See if he could bait the bots into copyright infringement, and just maybe, a pricey lawsuit." So he produced a drawing of a particularly sassy Mickey Mouse with the caption "This is NOT a parody. We committed copyright infringement and want to be sued by Disney." His version of the stunt succeeded spectacularly. First, the bots came out of the woodwork, drawn by hundreds of tweets from people saying they wanted the image on a t-shirt. Then other artists repeated the trick with infringing images including Pikachu, Mario, and the Coca-Cola logo....

Read more of this story at Slashdot.

04:18

Things Microsoft will be glad to never see again: Windows 10 1809 and Windows Phone Office [The Register]

New builds, Project Scarlett and much more

Roundup  New builds, a prolonged farewell to old friends and a new toy (for the boss, at least). That's right, it's the past week at Microsoft brought to you in bite-sized chunks by El Reg.…

03:00

Gee, S/4HANA. Just what I always wanted: Customers are wary of what's in SAP's sack [The Register]

So the decades we invested in the last platform mean nothing?

The lure of shiny new things is particularly irrepressible in December as Christmas approaches, but SAP customers seem to be able to resist it.…

02:18

Linus Rejects "Size Of Member" Change From Linux 5.5 Kernel [Phoronix]

This weekend brought a last-minute pull request by Google's Kees Cook to introduce the new sizeof_member() macro, a change that had been previously rejected from Linux 5.4. Well, it was again rejected by Linus Torvalds prior to tagging the Linux 5.5-rc1 kernel...

02:10

Here's a bit of Intel for you: Neri a day goes by that HPE doesn't feel CPU shortage pinch [The Register]

Not just server pain, PCs too, says new CEO

Hewlett Packard Enterprise is feeling the effects of Intel shortages in the server market, the company CEO has told us.…

02:00

Contribute at the Fedora Test Week for Kernel 5.4 [Fedora Magazine]

The kernel team is working on final integration for kernel 5.4. This version was recently released upstream and will arrive soon in Fedora, bringing many security fixes with it. As a result, the Fedora kernel and QA teams have organized a test week from Monday, December 09, 2019 through Monday, December 16, 2019. Refer to the wiki page for links to the test images you’ll need to participate. Read below for details.

How does a test week work?

A test day/week is an event where anyone can help make sure changes in Fedora work well in an upcoming release. Fedora community members often participate, and the public is welcome at these events. If you’ve never contributed before, this is a perfect way to get started.

To contribute, you only need to be able to do the following things:

  • Download test materials, which include some large files
  • Read and follow directions step by step

The wiki page for the kernel test day has a lot of good information on what and how to test. After you’ve done some testing, you can log your results in the test day web application. If you’re available on or around the day of the event, please do some testing and report your results.
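As an unofficial aside (not part of the test day instructions), a quick way to double-check that you actually booted a 5.4 test kernel before logging results is a snippet along these lines:

    # Unofficial sanity check: confirm the running kernel is a 5.4 build
    # before reporting results in the test day web application.
    import platform

    release = platform.release()   # e.g. "5.4.2-300.fc31.x86_64" (illustrative string)
    major, minor = (int(part) for part in release.split(".")[:2])
    if (major, minor) >= (5, 4):
        print("Running", release, "-- go ahead and test.")
    else:
        print("Running", release, "-- boot the 5.4 test kernel image first.")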

Happy testing, and we hope to see you in the Test Week.

01:34

Free Software Foundation Offers Benefits and Merchandise In Its Annual Fundraiser [Slashdot]

An anonymous reader writes: The Free Software Foundation is holding its annual fundraiser, with a goal of attracting 600 new members by the end of December. (New members so far: 112.) "We are still fighting the oppressive nature of proprietary software," explains the campaign's web page. "We have made solid inroads, and the community is as passionate as ever." As a 501(c)(3) charity, the group's membership dues are all tax deductible, and associate memberships are just $10 a month ($5 for students). They come with special benefits including up to five email aliases in the member.fsf.org domain, eligibility to join the nonprofit Digital Credit Union, free admission to the annual LibrePlanet conference in Boston, and 20% discounts on FSF merchandise and GNU gear (including this delightful stuffed baby gnu). And for its special year-end fundraiser, different levels are also eligible for patches, backpacks, a thermos, and a public thank you at gnu.org. "With your things neatly organized in a backpack covered with patches, and coffee forever to go, you will be ready to fight for freedom!" And finally, they've also created images to share on social media, writing that "It is not always easy to explain to your neighbor or friend what free software is, or why it is so important. But taking the time to explain it, and motivating the people in your community to think critically about how much control they actually have over their software is the only way to keep our community growing and counter the billions of dollars that proprietary software companies use to strip our user rights."

Read more of this story at Slashdot.

01:12

Remember the Dutch kid who stuck his finger in a dam to save the village? Here's the IT equivalent [The Register]

It only took colleagues an hour to notice our hero was missing

Who, Me?  Welcome back to Who, Me?, The Register's weekly dip into the bottomless pool of cunning and calamity supplied by readers who have, in a real sense, been there and most definitely done that.…

00:01

US Homeland Security backs off on scanning US citizens, Amazon ups AI ante, and more [The Register]

And why China might not be as big as first thought in AI spending

Roundup  Hello, welcome to this week's machine learning musings. We bring you news about the hottest topics in AI: Facial recognition, the so-called AI arms race between the US and China, and erm, GPUs in the cloud.…

Sunday, 08 December

22:34

Former Oracle Product Manager Claims He Was Forced Out For Refusing to Sell Vaporware [Slashdot]

A former Oracle employee filed a lawsuit against the database giant on Tuesday claiming that he was forced out for refusing to lie about the functionality of the company's software. The civil complaint, filed on behalf of plaintiff Tayo Daramola in U.S. District Court in San Francisco, contends that Oracle violated whistleblower protections under the Sarbanes-Oxley Act and the Dodd-Frank Act, the RICO Act, and the California Labor Code. According to the court filing, Daramola, a resident of Montreal, Canada, worked for Oracle's NetSuite division from November 30, 2016 through October 13, 2017. He served as a project manager for an Oracle cloud service known as the Cloud Campus BookStore initiative and dealt with US customers. Campus bookstores, along with ad agencies, and apparel companies are among the market segments targeted by Oracle and NetSuite. Daramola's clients are said to have included the University of Washington, the University of Oregon, the University of Texas at Austin, Brigham Young University and the University of Southern California. The problem, according to the complaint, is that Oracle was asking Daramola to sell vaporware -- a charge the company denies. "Daramola gradually became aware that a large percentage of the major projects to which he was assigned were in 'escalation' status with customers because Oracle had sold his customers software products it could not deliver, and that were not functional," the complaint says. Daramola realized that his job "was to ratify and promote Oracle's repeated misrepresentations to customers" about the capabilities of its software, "under the premise of managing the customer's expectations." The ostensible purpose of stringing customers along in this manner was to buy time so Oracle could actually implement the capabilities it was selling, the court filing states. As Daramola saw it, his job as project manager thus required him to participate "in a process of affirmative misrepresentation, material omission, and likely fraud." "We don't agree with the allegations," Oracle told The Register "and intend to vigorously defend the matter." The article also notes that in 2016 Oracle faced another whistleblower lawsuit, this one brought by a former senior finance manager at Oracle who'd said her bosses directed her to inflate the company's cloud sales. Oracle settled that lawsuit "while denying any wrongdoing."

Read more of this story at Slashdot.

22:11

WireGuard Sends Out Latest Patch Revision In Preparing For Linux 5.6 [Phoronix]

While there are some pretty great features for Linux 5.5, one that didn't make it quite in time was the long-awaited introduction of WireGuard as the in-kernel secure VPN tunnel. While it was a bummer it didn't make 5.5, all indications at this point are that it will be in Linux 5.6...

19:34

Disney Warns 'Star Wars: The Rise of Skywalker' Effects Could Cause Seizures [Slashdot]

"The Walt Disney Co. is asking exhibitors worldwide to warn moviegoers that Star Wars: The Rise of Skywalker may pose a seizure risk to audience members with photosensitive epilepsy," reports Deadline: In an unusual move, Disney has sent a letter to theater owners and operators worldwide with a recommendation that special steps should be taken to alert moviegoers about the visual effects and flashing lights in the J.J. Abrams-directed interstellar adventure. "Out of an abundance of caution," the letter opens, "we recommend that you provide at your venue box office and online, and at other appropriate places where your customers will see it, a notice containing the following information: Star Wars: The Rise of Skywalker contains several sequences with imagery and sustained flashing lights that may affect those who are susceptible to photosensitive epilepsy or have other photosensitivities." The Burbank-based Disney is also working with the Epilepsy Foundation, which issued an advisory of its own and commended the studio for taking the initiative on the audience safety issue. About 3.4 million Americans have epilepsy and about three percent have photosensitivity issues that puts them at risk of seizures triggered by flashing lights or other visual patterns.

Read more of this story at Slashdot.

17:34

Ohio Neighborhood Temporarily Evacuated Over Misplaced Fears of a Homemade Nuclear Reactor [Slashdot]

"A 911 call Thursday led to a precautionary evacuation of an entire street in a Northwest Side neighborhood in Columbus over concerns about a possible small nuclear reactor and alpha waves reported by a resident who said he sustained burns in his garage on the device," acocrding to the Columbus Dispatch. Slashdot reader k6mfw shared their report: In the end, authorities found no hazard. The man will undergo a mental-health examination and may face charges of inducing a panic. The man, who is in his late 20s or early 30s and who resides on the 6300 block of Chippenhook Court, called 911 about 6:15 p.m. and reported he had been sustained burns from a device he was working on in his garage. Battalion Chief Steve Martin, the Columbus Fire Division's media spokesman, said the man's description of the device suggested he was working on a small nuclear reactor and included references to a particle accelerator and alpha waves. The latter reference led to concerns about potential radiation, he said. Hazmat, bomb squad and other emergency responders -- operating out of an abundance of caution -- evacuated the approximately 40 residences on the cul-de-sac street in the Cranston Commons development while they assessed the situation, Martin said... He said medics determined the man did not appear to be injured, at least not seriously. Radiation level checks were conducted on the man and then at the residence and nothing was found, Martin said. A nuclear specialist brought to the scene found in the garage what was identified as a homemade capacitor, Martin said. A capacitor is a device that consists of two or more separate conducting plates and is used to store an electric charge, not unlike a battery. After it was determined there was no threat, residents were allowed to return to their homes at 9:20 p.m. Depending on the evaluation and further investigation, it is possible the man will be criminally charged with inducing a panic, Martin said. Only one injury was reported: a firefighter in a hazmat suit was injured when he unepectedly came off a curb and twisted his ankle, Martin said.

Read more of this story at Slashdot.

16:55

Linux 5.5-rc1 Kernel Released With 12,500+ Commits [Phoronix]

Linus Torvalds has just issued the first release candidate of the Linux 5.5 cycle following the traditional two week long merge window...

16:39

Linux 5.5 Feature Overview - Raspberry Pi 4 To New Graphics Capabilities To KUnit [Phoronix]

Linux 5.5-rc1 is on the way to mirrors and with that the Linux 5.5 merge window is now over. Here is a look at the lengthy set of changes and new features for this next Linux kernel that will debut as stable in early 2020.

16:34

Astronomers Find the Biggest Black Hole Ever Measured [Slashdot]

"Astronomers have found the biggest black hole ever measured -- it's 40 billion times the sun's mass, or roughly two-thirds the mass of all stars in the Milky Way," writes Astronomy.com. Iwastheone shares their report: The gargantuan black hole lurks in a galaxy that's supermassive itself and probably formed from the collisions of at least eight smaller galaxies. Holm 15A is a huge elliptical galaxy at the center of a cluster of galaxies called Abell 85... When two spiral galaxies -- like our Milky Way and the nearby Andromeda Galaxy -- collide, they can merge and form an elliptical galaxy. In crowded environments like galaxy clusters, these elliptical galaxies can collide and merge again to form an even larger elliptical galaxy. Their central black holes combine as well and make larger black holes, which can kick huge swaths of nearby stars out to the edges of the newly formed galaxy. The resulting extra-large elliptical galaxy usually doesn't have much gas from which to form new stars, so its center looks pretty bare after its black hole kicks out nearby stars. Astronomers call these huge elliptical galaxies with faint centers "cored galaxies." Massive cored galaxies often sit in the centers of galaxy clusters. The authors of the new study found that Holm 15A, the enormous galaxy at the center of its home galaxy cluster, must have formed from yet another merger of two already-huge cored elliptical galaxies. That would mean Holm 15A probably formed from the combination of eight smaller spiral galaxies over billions of years... This series of mergers also created the black hole in its center, a monster about as big as our solar system but with the mass of 40 billion suns. One of the study's authors says their discovery finally confirms the current theory about how quasars work.

Read more of this story at Slashdot.

15:34

Chinese Newspaper Touts Videogame Where Players 'Hunt Down Traitors' in Hong Kong [Slashdot]

The Global Times is a daily tabloid newspaper published "under the auspices" of the Chinese Communist Party's People's Daily, according to Wikipedia. And this week Slashdot reader Tulsa_Time noticed that this official state-run newspaper "promoted a video game where users are tasked with hunting down the 'traitors' leading Hong Kong's ongoing pro-democracy demonstrations." Here's an excerpt from the article by China's state-run newspaper: An online game calling on players to hunt down traitors who seek to separate Hong Kong from China and fuel street violence has reportedly begun to attract players across Chinese mainland social media platforms. The game, "fight the traitors together," is set against the backdrop of the social unrest that has persisted in Hong Kong. The script asks the player to find eight secessionists hidden in the crowd participating in Hong Kong protests. Players can knock them down with slaps or rotten eggs until they are captured. Online gamers claim the game allows them to vent their anger at the separatist behavior of secessionists during the recent Hong Kong riots. The eight traitors in the game, caricatured based on real people, include Jimmy Lai Chee-ying, Martin Lee Chu-ming and Joshua Wong Chi-fung, prominent opposition figures who have played a major role in inciting unrest in Hong Kong. There are also traitor figures in ancient China... In the game, amid a crowd of black-clad rioters wearing yellow hats and face masks, Anson Chan Fang On-sang, another leading opposition figure, carries a bag with a U.S. flag, clutches a stack of U.S. dollars and holds a loudspeaker to incite violence in the streets.

Read more of this story at Slashdot.

14:38

Debian Begins Vote On Supporting Non-Systemd Init Options [Slashdot]

"It's been five years already since the vote to transition to systemd in Debian over Upstart," reports Phoronix, noting that the Debian developer community has now begun a 20-day ranked-choice vote on eight different proposals for "'init system diversity' and just how much Debian developers care (or not) in supporting alternatives to systemd." The eight options they're voting on: Choice 1: F: Focus on systemd Choice 2: B: Systemd but we support exploring alternatives Choice 3: A: Support for multiple init systems is Important Choice 4: D: Support non-systemd systems, without blocking progress Choice 5: H: Support portability, without blocking progress Choice 6: E: Support for multiple init systems is Required Choice 7: G: Support portability and multiple implementations Choice 8: Further Discussion There's detailed descriptions of each option on the Debian developers mailing list. "This is a non-secret vote," the post explains. "After the voting period is over the details on who voted what will be published."

Read more of this story at Slashdot.

13:34

Linux Users Can Now Use Disney+ After DRM Fix [Slashdot]

"Linux users can now stream shows and movies from the Disney+ streaming service after Disney lowered the level of their DRM requirements," reports Bleeping Computer: When Disney+ was first launched, Linux users who attempted to watch shows and movies were shown an error stating "Something went wrong. Please try again. If the problem persists, visit the Disney+ Help Center (Error Code 83)." As explained by Hans de Goede, this error was being caused by the Disney+ service using the highest level of security for the Widevine Digital Rights Management (DRM) technology. As some Linux and Android devices did not support this higher DRM security level, they were unable to stream Disney+ shows in their browsers... Yesterday, Twitter users discovered that Disney+ had suddenly started working on Linux browsers after the streaming service tweaked their DRM security levels... Even with Disney+ lowering the DRM requirements, users must first make sure DRM is enabled in the browser. For example, Disney+ will not work with Firefox unless you enable the "Play DRM-controlled content" setting in the browser.

Read more of this story at Slashdot.

12:34

20 Low-End VPS Providers Suddenly Shutting Down In a 'Deadpooling' Scam [Slashdot]

"At least 20 web hosting providers have hastily notified customers today, Saturday, December 7, that they plan to shut down on Monday, giving their clients two days to download data from their accounts before servers are shut down and wiped clean," reports ZDNet. And no refunds are being provided: All the services offer cheap low-end virtual private servers [and] all the websites feature a similar page structure, share large chunks of text, use the same CAPTCHA technology, and have notified customers using the same email template. All clues point to the fact that all 20 websites are part of an affiliate scheme or a multi-brand business ran by the same entity... As several users have pointed out, the VPS providers don't list physical addresses, don't list proper business registration information, and have no references to their ownership... A source in the web hosting industry who wanted to remain anonymous told ZDNet that what happened this weekend is often referred to as "deadpooling" -- namely, the practice of setting up a small web hosting company, providing ultra-cheap VPS servers for a few dollars a month, and then shutting down a few months later, without refunding customers. "This is a systemic issue within the low-end market, we call it deadpooling," the source told us. "It doesn't happen often at this scale, however." ZDNet provided this alphabetical list of the 20 companies: ArkaHosting, Bigfoot Servers, DCNHost, HostBRZ, HostedSimply, Hosting73, KudoHosting, LQHosting, MegaZoneHosting, n3Servers, ServerStrong, SnowVPS, SparkVPS, StrongHosting, SuperbVPS, SupremeVPS, TCNHosting, UMaxHosting, WelcomeHosting, X4Servers. However, "A user who was impacted by his VPS provider's shutdown also told ZDNet that the number of VPS providers going down is most likely higher than 20, as not all customers might have shared the email notification online, with others."

Read more of this story at Slashdot.

11:34

Open-Source Security Nonprofit Tries Raising Money With 'Hacker-Themed' T-Shirts [Slashdot]

The nonprofit Open Source Technology Improvement Fund connects open-source security projects with funding and logistical support. (Launched in 2015, the Illinois-based group includes on its advisory council representatives from DuckDuckGo and the OpenVPN Project.) To raise more money, they're now planning to offer "hacker-themed swag" and apparel created with a state-of-the-art direct-to-garment printer -- and they're using Kickstarter to help pay for that printer: With the equipment fully paid for, we will add a crucial revenue stream to our project so that we can get more of our crucial work funded. OSTIF is kicking-in half of the funding for the new equipment from our own donated funds from previous projects, and we are raising the other half through this KickStarter. We have carefully selected commercial-grade equipment, high quality materials, and gathered volunteers to work on the production of the shirts and wallets. Pledges of $15 or more will be rewarded with an RFID-blocking wallet that blocks "drive-by" readers from scanning cards in your pocket, engraved with the message of your choice. And donors pledging $18 or more get to choose from their "excellent gallery" of t-shirts. Dozens of artists have contributed more than 40 specially-commissioned "hacker-themed" designs, including "Resist Surveillance" and "Linux is Communism" (riffing on a 2000 remark by Microsoft's CEO Steve Ballmer). There are also shirts commemorating Edward Snowden (including one with an actual NSA document leaked by Edward Snowden) as well as a mock concert t-shirt for the "world tour" of the EternalBlue exploit listing locations struck after it was weaponized by the NSA. One t-shirt even riffs on the new millennial catchphrase "OK boomer" -- replacing it with the phrase "OK Facebook" using fake Cyrillic text. And one t-shirt design shows an actual critical flaw found by the OSTIF while reviewing OpenVPN 2.4.0. So far they have 11 backers, earning $790 of their $45,000 goal.

Read more of this story at Slashdot.

10:34

Will Robots Wipe Out Wall Street's Highest-Paying Jobs? [Slashdot]

An anonymous reader quotes Bloomberg: Robots have replaced thousands of routine jobs on Wall Street. Now, they're coming for higher-ups. That's the contention of Marcos Lopez de Prado, a Cornell University professor and the former head of machine learning at AQR Capital Management LLC, who testified in Washington on Friday about the impact of artificial intelligence on capital markets and jobs. The use of algorithms in electronic markets has automated the jobs of tens of thousands of execution traders worldwide, and it's also displaced people who model prices and risk or build investment portfolios, he said. "Financial machine learning creates a number of challenges for the 6.14 million people employed in the finance and insurance industry, many of whom will lose their jobs -- not necessarily because they are replaced by machines, but because they are not trained to work alongside algorithms," Lopez de Prado told the U.S. House Committee on Financial Services.

Read more of this story at Slashdot.

09:58

Saturday Morning Breakfast Cereal - Badness [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
You can categorize countries by how many derivatives you need to take before you can feel good about yourself.



09:34

81-Year-Old Donald Knuth Releases New TAOCP Book, Ready to Write Hexadecimal Reward Checks [Slashdot]

In 1962, 24-year-old Donald Knuth began writing The Art of Computer Programming -- and 57 years later, he's still working on it. But he's finally released The Art of Computer Programming, Volume 4, Fascicle 5: Mathematical Preliminaries Redux; Introduction to Backtracking; Dancing Links. An anonymous reader writes: On his personal site at Stanford, 81-year-old Donald Knuth promised this newly-released section "will feature more than 650 exercises and their answers, designed for self-study," and he shared an excerpt from "the hype on its back cover": This fascicle, brimming with lively examples, forms the first third of what will eventually become hardcover Volume 4B. It begins with a 27-page tutorial on the major advances in probabilistic methods that have been made during the past 50 years, since those theories are the key to so many modern algorithms. Then it introduces the fundamental principles of efficient backtrack programming, a family of techniques that have been a mainstay of combinatorial computing since the beginning. This introductory material is followed by an extensive exploration of important data structures whose links perform delightful dances. That section unifies a vast number of combinatorial algorithms by showing that they are special cases of the general XCC problem -- "exact covering with colors." The first fruits of the author's decades-old experiments with XCC solving are presented here for the first time, with dozens of applications to a dazzling array of questions that arise in amazingly diverse contexts... Knuth is still offering his famous hexadecimal reward checks (now referred to as "reward certificates," since they're drawn on the imaginary Bank of San Serriffe) to any reader who finds a technical (or typographical) error. "Of course those exercises, like those in Fascicle 6, include many cutting-edge topics that weren't easy for me to boil down into their essentials. So again I'm hoping to receive 'Dear Don' letters...either confirming that at least somebody besides me believes that I did my job properly, or pointing out what I should really have said...." And to make it easier he's even shared a list of the exercises where he's still "seeking help and reassurance" about the correctness of his answers. "Let me reiterate that you don't have to work the exercise first. You're allowed to peek at the answer; indeed, you're encouraged to do so, in order to verify that the answer is 100% correct."
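For readers who have not met the exact cover problem underlying dancing links, here is a tiny illustrative sketch of the plain backtracking it accelerates, run on a small standard example instance; it is nothing like Knuth's optimized DLX code.

    # Exact cover by plain backtracking: pick subsets so every element of the
    # universe is covered exactly once. Knuth's dancing links (DLX) solves the
    # same problem far more efficiently; this sketch only shows the problem.
    def exact_cover(universe, subsets, chosen=None):
        chosen = chosen or []
        covered = set().union(*(subsets[name] for name in chosen))
        if covered == universe:
            yield list(chosen)
            return
        element = min(universe - covered)          # branch on first uncovered element
        for name, members in subsets.items():
            if element in members and not (members & covered):
                chosen.append(name)
                yield from exact_cover(universe, subsets, chosen)
                chosen.pop()

    universe = {1, 2, 3, 4, 5, 6, 7}
    subsets = {
        "A": {1, 4, 7}, "B": {1, 4}, "C": {4, 5, 7},
        "D": {3, 5, 6}, "E": {2, 3, 6, 7}, "F": {2, 7},
    }
    for solution in exact_cover(universe, subsets):
        print(solution)                            # ['B', 'F', 'D'] covers 1..7 exactly once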

Read more of this story at Slashdot.

08:36

Unified sizeof_member() Re-Proposed For Linux 5.5 [Phoronix]

After not being merged for Linux 5.4, the new sizeof_member() macro as a unified means of calculating the size of a member of a struct has been volleyed for Linux 5.5 for possible inclusion on this last day of the merge window...

08:34

Tesla Wants To Clean Windshields With Laser Beams [Slashdot]

Tesla "may be keen on replacing the humble windshield wiper with lasers," reports CNET. In a patent application filed this past May and published with the United States Patent and Trademark Office on Nov. 21, Tesla describes a "pulsed laser cleaning" for "debris accumulated" on glass, specifically for automotive application. It also mentions this could be used for "photo-voltaic" applications. That's fancy-speak for solar panels... According to the patent, Tesla imagines the system would work with a beam optics assembly to produce a laser that hunts down debris. A detection circuitry would be responsible for telling the system where to fire and remove dirt, grime and droppings. This same system would also take into account the laser's exposure level with pulses to ensure it didn't cut through the glass or harm occupants inside. Specifically, a calibration would "limit penetration of the laser beam to a depth that is less than a thickness of the glass article." Such a system could do without chemicals and sprayers to take care of cleaning the windshield. Ditto for camera lenses and solar panels installed on a vehicle or structure.

Read more of this story at Slashdot.

07:34

Uber Loses $1.4 Billion in Value After Acknowledging Thousands of Sexual Assaults [Slashdot]

"Uber's stock market value fell by $1.4 billion Friday, on the heels of the company's release of a safety report revealing that 3,000 incidents of sexual assaults took place during its U.S. rides in 2018," reports the Bay Area News Group: On Thursday evening, Uber released its long-awaited safety study, which revealed that the company received 3,045 reports of sexual assaults in its rides in 2018, and 2,936 such incidents in 2017. Those figures included 235 reports of rape in 2018, up from 229 in 2017, and thousands of other assaults ranging from unwanted touching, kissing or attempted rape. Between 2017 and 2018, the company said it averaged about 3.1 million rides in the U.S. each day. Uber is not the only ride-hailing company grappling with safety issues, though. Earlier Thursday, its top rival, Lyft, was sued by 20 women alleging they were raped or sexually assaulted by Lyft drivers. In a statement on Twitter following the safety report's release Thursday, Uber Chief Executive Dara Khosrowshahi pledged to take further measures to protect the safety of both passengers and drivers. The editorial boards of two Silicon Valley newspapers said the report seems to be an attempt "to confront legitimate problems head-on and transparently," asking how the figures compare to those for taxicabs and applauding Uber for instituting tighter background checks on drivers and adding more safety features to Uber's app. "But it also must acknowledge that these safety issues should have been anticipated. Entrepreneurs aren't doing themselves -- or their industry -- any favors when they fail to anticipate problems and only act on consumer issues after the fact."

Read more of this story at Slashdot.

06:00

Mozilla Releases DeepSpeech 0.6 With Better Performance, Leaner Speech-To-Text Engine [Phoronix]

One of the side projects Mozilla continues to develop is DeepSpeech, a speech-to-text engine derived from research by Baidu and built atop TensorFlow with both CPU and NVIDIA CUDA acceleration. This week marked the release of Mozilla DeepSpeech 0.6 with performance optimizations, Windows builds, lightening up the language models, and other changes...

05:45

KDE Plasma 5.18 Introducing Built-In Emoji Picker [Phoronix]

KDE Plasma is gearing up for 2020 by introducing a built-in emoji picker... Coming with Plasma 5.18 is easier support for inserting Unicode emojis...

05:34

Can We Kill Fake News With Blockchain? [Slashdot]

"One of the more unique future uses for blockchain may be thwarting fake news," writes Computerworld's senior reporter, citing a recent report from Gartner: By 2023, up to 30% of world news and video content will be authenticated as real by blockchain ledgers, countering "Deep Fake technology," according to Avivah Litan, a Gartner vice president of research and co-author of the "Predicts 2020: Blockchain Technology" report... "Tracking assets and proving provenance are two key successful use cases for permissioned blockchain and can be readily applied to tracking the provenance of news content...." The New York Times is one of the first major news publications to test blockchain to authenticate news photographs and video content, according to Gartner. The newspaper's Research and Development team and IBM have partnered on the News Provenance Project, which uses Hyperledger Fabric's permissioned blockchain to store "contextual metadata." That metadata includes when and where a photo or video was shot, who took it and how and when it was edited and published... Blockchain could also change how corporate public offerings are done. Many private companies forgo a public offering because of the complexity of the process. With blockchain ledgers, securities linked to digital tokens could move more easily between financial institutions by simply bypassing a central clearance organization. The article also envisions a world with encrypted digital online credentials on a blockchain ledger that are linked back to the bank, employer, or government agency that created them. Ken Elefant, managing director of investment firm Sorenson Capital, sees this technology creating a world of super-secure profiles where "only the individuals or companies authorized have access to that data -- unlike the internet today where you and I can be marketed to by anybody,"

Read more of this story at Slashdot.

01:34

Reddit Bans 61 Accounts, Citing 'Coordinated' Russian Campaign To Interfere In UK Vote [Slashdot]

"The prospect of Russian interference in Britain's election flared anew Saturday after the social media platform Reddit concluded that people from Russia leaked confidential British government documents on Brexit trade talks just days before the general U.K. vote," reports the Associated Press: Reddit said in a statement that it has banned 61 accounts suspected of violating policies against vote manipulation. It said the suspect accounts shared the same pattern of activity as a Russian interference operation dubbed "Secondary Infektion" that was uncovered earlier this year. Reddit investigated the leak after the documents became public during the campaign for Thursday's election, which will determine the country's future relationship with the European Union. All 650 seats in the House of Commons are up for grabs. Reddit said it believed the documents were leaked as "part of a campaign that has been reported as originating from Russia." "We were able to confirm that they did indeed show a pattern of coordination," Reddit said... Culture Secretary Nicky Morgan told the BBC that the government is "looking for and monitoring" anything that might suggest interference in the British election. "From what was being put on that (Reddit) website, those who seem to know about these things say that it seems to have all the hallmarks of some form of interference,"" Morgan said. "And if that is the case, that obviously is extremely serious."

Read more of this story at Slashdot.

00:54

The GCC Git Conversion Heats Up With Hopes Of Converting Over The Holidays [Phoronix]

Decided back at the GNU Tools Cauldron was a timeline aiming to convert from Subversion, their default revision control system, to Git over the New Year's holiday. For that to happen, by the middle of December they wanted to decide what conversion system to use for bringing all their SVN commits to Git. As such, now it's heating up ahead of that decision...

Saturday, 07 December

22:05

Raptor Computing Is Working On More AMD Radeon Driver Improvements For POWER [Phoronix]

Raptor Computing Systems, the libre hardware company behind the POWER9-based Talos II server board and Blackbird micro-ATX desktop, has been working to improve the open-source AMD Radeon graphics driver support for IBM POWER...

21:34

Are You Ready for the End of Python 2? [Slashdot]

"Users of an old version of the popular Python language face a reckoning at the end of the year," reports Wired, calling it a programmer's "own version of update hell." The developers who maintain Python, who work for a variety of organizations or simply volunteer their time, say they will stop supporting Python 2 on January 1, 2020 -- more than a decade after the introduction of Python 3 in December 2008. That means no more security fixes or other updates, at least for the official version of Python. The Python team extended the initial deadline in 2015, after it became apparent that developers needed more time to make the switch. It's hard to say how many organizations still haven't made the transition. A survey of developers last year by programming toolmaker JetBrains found that 75 percent of respondents use Python 3, up from 53 percent the year before. But data scientist Vicki Boykis points out in an article for StackOverflow that about 40 percent of software packages downloaded from the Python code management system PyPI in September were written in Python 2.7. For many companies, the transition remains incomplete. Even Dropbox, which employed Python creator Guido van Rossum until his retirement last month, still has some Python 2 code to update. Dropbox engineer Max Belanger says shifting the company's core desktop application from Python 2 to Python 3 took three years. "It wasn't a lot of absolute engineering work," Belanger says. "But it took a long time because stability is so important. We wanted to make sure our users didn't feel any effects of the transition." The transition from Python 2 to 3 is challenging in part because of the number and complexity of other tools that programmers use. Programmers often rely on open source bundles of code known as "libraries" that handle common tasks, such as connecting to databases or verifying passwords. These libraries spare developers from having to rewrite these features from scratch. But if you want to update your code from Python 2 to Python 3, you need to make sure all the libraries you use also have made the switch. "It isn't all happening in isolation," Belanger says. "Everyone has to do it." Today, the 360 most popular Python packages are all Python 3-compatible, according to the site Python 3 Readiness. But even one obscure library that hasn't updated can cause headaches. Python's core team is now prioritizing smaller (but more frequent) updates to make it easier to migrate to newer versions, according to the article, noting that Guido Van Rossum "wrote last month that there might not ever be a Python 4. The team could just add features to Python 3 indefinitely that don't break backward compatibility."

Read more of this story at Slashdot.

19:34

Researchers Call Chronic Inflammation 'A Substantial Public Health Crisis' [Slashdot]

UPI reports: Roughly half of all deaths worldwide are caused by inflammation-related diseases. Now, a team of international researchers is calling on physicians to focus greater attention on the diagnosis, prevention and treatment of severe, chronic inflammation so that people can live longer, healthier lives. In a commentary published Friday in the journal Nature Medicine, researchers at 22 institutions describe how persistent and severe inflammation in the body is often a precursor for heart disease, cancer, kidney disease, diabetes, and autoimmune and neurodegenerative disorders. The researchers point to inflammation-related conditions as the cause of roughly 50 percent of all deaths worldwide. "This is a substantial public health crisis," co-author George Slavich, a research scientist at the Norman Cousins Center for Psychoneuroimmunology at UCLA, said in a statement. "It's also important to recognize that inflammation is a contributor not just to physical health problems, but also mental health problems such as anxiety disorders, depression, PTSD, schizophrenia, self-harm and suicide." In the commentary, Slavich and his fellow authors describe inflammation as a naturally occurring response by the body's immune system that helps it fight illness and infection. However, when inflammation is chronic, it can increase the risk for developing potentially deadly diseases.

Read more of this story at Slashdot.

17:34

Will Plunging Battery Prices Start a Boom In Electric Power? [Slashdot]

An anonymous reader quotes Utility Dive: Average market prices for battery packs have plunged from $1,100 per kilowatt hour in 2010 to $156 per kilowatt hour in 2019, an 87% fall in real terms, according to a report released Tuesday by Bloomberg New Energy Finance (BNEF). Prices are projected to fall to around $100 per kilowatt hour by 2023, driving electrification across the global economy, according to BNEF's forecast. BNEF's latest forecast, from its 2019 Battery Price Survey, is an example of how advancements in battery technology have driven down costs at rates faster than previously predicted. Three years ago, when battery prices were around $300 per kilowatt hour, BNEF projected they would fall to $120 per kilowatt hour by 2030... The cost of lithium-ion batteries mandates the cost of electric vehicles for consumers and the ability of battery storage projects to compete in electricity markets. As they get cheaper, batteries will be used in more industry sectors. "For example, the electrification of commercial vehicles, like delivery vans, is becoming increasingly attractive," BNEF said. Earlier this year, Amazon placed an order for 100,000 all-electric delivery vans from Michigan-based start-up manufacturer Rivian. Just this week, Reuters reported that DHL will run pilot programs for its StreetScooter electric delivery vehicles in U.S. cities, starting in 2020.
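As a quick check of those figures, the nominal drop alone works out to

    \frac{1100 - 156}{1100} \approx 0.86

or about 86%; BNEF's slightly larger 87% comes from stating the fall in real (inflation-adjusted) terms.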

Read more of this story at Slashdot.

16:34

'Why Are Cops Around the World Using This Outlandish Mind-Reading Tool?' [Slashdot]

ProPublica has determined that dozens of state and local agencies have purchased "SCAN" training from a company called LSI for reviewing a suspect's written statements -- even though there's no scientific evidence that it works. Local, state and federal agencies from the Louisville Metro Police Department to the Michigan State Police to the U.S. State Department have paid for SCAN training. The LSI website lists 417 agencies nationwide, from small-town police departments to the military, that have been trained in SCAN -- and that list isn't comprehensive, because additional ones show up in procurement databases and in public records obtained by ProPublica. Other training recipients include law enforcement agencies in Australia, Belgium, Canada, Israel, Mexico, the Netherlands, Singapore, South Africa and the United Kingdom, among others... For Avinoam Sapir, the creator of SCAN, sifting truth from deception is as simple as one, two, three. 1. Give the subject a pen and paper. 2. Ask the subject to write down his/her version of what happened. 3. Analyze the statement and solve the case. Those steps appear on the website for Sapir's company, based in Phoenix. "SCAN Unlocks the Mystery!" the homepage says, alongside a logo of a question mark stamped on someone's brain. The site includes dozens of testimonials with no names attached. "Since January when I first attended your course, everybody I meet just walks up to me and confesses!" one says. [Another testimonial says "The Army finally got its money's worth..."] SCAN saves time, the site says. It saves money. Police can fax a questionnaire to a hundred people at once, the site says. Those hundred people can fax it back "and then, in less than an hour, the investigator will be able to review the questionnaires and solve the case." In 2009 the U.S. government created a special interagency task force to review scientific studies and independently investigate which interrogation techniques worked, assessed by the FBI, CIA and the U.S. Department of Defense. "When all 12 SCAN criteria were used in a laboratory study, SCAN did not distinguish truth-tellers from liars above the level of chance," the review said, also challenging two of the method's 12 criteria. "Both gaps in memory and spontaneous corrections have been shown to be indicators of truth, contrary to what is claimed by SCAN." In a footnote, the review identified three specific agencies that use SCAN: the FBI, CIA and U.S. Army military intelligence, which falls under the Department of Defense... In 2016, the same year the federal task force released its review of interrogation techniques, four scholars published a study on SCAN in the journal Frontiers in Psychology. The authors -- three from the Netherlands, one from England -- noted that there had been only four prior studies in peer-reviewed journals on SCAN's effectiveness. Each of those studies (in 1996, 2012, 2014 and 2015) concluded that SCAN failed to help discriminate between truthful and fabricated statements. The 2016 study found the same. Raters trained in SCAN evaluated 234 statements -- 117 true, 117 false. Their results in trying to separate fact from fiction were about the same as chance.... Steven Drizin, a Northwestern University law professor who specializes in wrongful convictions, said SCAN and assorted other lie-detection tools suffer from "over-claim syndrome" -- big claims made without scientific grounding. 
Asked why police would trust such tools, Drizin said: "A lot has to do with hubris -- a belief on the part of police officers that they can tell when someone is lying to them with a high degree of accuracy. These tools play in to that belief and confirm that belief." SCAN's creator "declined to be interviewed for this story," but ProPublica spoke to some users of the technique. Travis Marsh, the head of an Indiana sheriff's department, has been using the tool for nearly two decades, while acknowledging that he can't explain how it works. "It really is, for lack of a better term, a faith-based system because you can't see behind the curtain." ProPublica also reports that "Years ago his wife left a note saying she and the kids were off doing one thing, whereas Marsh, analyzing her writing, could tell they had actually gone shopping. His wife has not left him another note in at least 15 years..."

Read more of this story at Slashdot.

16:09

SUSE Revives Patches For Exposing /proc/cpuinfo Data Over Sysfs [Phoronix]

Back in 2017 there were patches for exposing /proc/cpuinfo data via sysfs to more easily parse selected bits of information from the CPU information output. That work never made it into the mainline kernel but now SUSE's Thomas Renninger is taking over and trying to get revised patches into the kernel...

15:34

Volkswagen Headquarters Raided Again After They Disclosed New Diesel Filtering 'Issue' [Slashdot]

"Reuters is reporting that German public prosecutors have again raided the Wolfsburg headquarters of Volkswagen in the latest investigation into the carmaker's diesel emissions," writes Slashdot reader McGruber. The purpose of the raid was to "confiscate documents," the article reports: Volkswagen, which admitted in 2015 to cheating U.S. emissions tests on diesel engines, said it was fully cooperating with the authorities, but viewed the investigation as unfounded.... The carmaker said it had itself disclosed the issue at the center of the new investigation -- which is targeting individual employees -- to the relevant registration authorities... Volkswagen said the raids were linked to an investigation into diesel cars with engine type EA 288, a successor model to the EA 189 which was at the heart of the test cheating scandal... In simulations, vehicles with the EA 288 engine did not indicate a failure of the diesel filter, while still complying with emissions limits, Volkswagen said, adding the engine did not have an illegal defeat device.

Read more of this story at Slashdot.

14:34

The U.S. Considers Ban on Exporting Surveillance Technology To China [Slashdot]

The South China Morning Post reports that the U.S. may be taking a stand against China. This week the U.S. House of Representatives passed a new bill that would "tighten export controls on China-bound U.S. technology that could be used to 'suppress individual privacy, freedom of movement and other basic human rights' [and] ordering the U.S. president, within four months of the legislation's enactment, to submit to Congress a list of Chinese officials deemed responsible for, or complicit in, human rights abuses in Xinjiang... "The UIGHUR Act also demands that, on the same day, those individuals are subject to sanctions under the Global Magnitsky Act, seizing their U.S.-based assets and barring them from entry onto U.S. soil." Reuters notes that American government officials "have sounded the alarm on China's detention of at least a million Uighur Muslims, by U.N. estimates, in the northwestern region of Xinjiang as a grave abuse of human rights and religious freedom..." U.S. congressional sources and China experts say Beijing appears especially sensitive to provisions in the Uighur Act passed by the House of Representatives this week banning exports to China of items that can be used for surveillance of individuals, including facial and voice-recognition technology... A U.S. congressional source also said a Washington-based figure close to the Chinese government told him recently it disliked the Uighur bill more than the Hong Kong bill for "dollars and cents reasons," because the former measure contained serious export controls on money-spinning security technology, while also threatening asset freezes and visa bans on individual officials. Victor Shih, an associate professor of China and Pacific Relations at the University of California, San Diego, said mass surveillance was big business in China and a number of tech companies there could be hurt by the law if it passes. China spent roughly 1.24 trillion yuan ($176 billion) on domestic security in 2017 -- 6.1% of total government spending and more than was spent on the military. Budgets for internal security, of which surveillance technology is a part, have doubled in regions including Xinjiang and Beijing.

Read more of this story at Slashdot.

13:34

Remembering Star Trek Writer DC Fontana, 1939-2019 [Slashdot]

Long-time Slashdot reader sandbagger brings the news that D. C. Fontana, an influential story editor and writer on the original 1960s TV series Star Trek, has died this week. People reports: The writer is credited with developing the Spock character's backstory and "expanding Vulcan culture," SyFy reported of her massive contribution to the beloved sci-fi series. Fontana was the one who came up with Spock's childhood history revealed in "Yesteryear," an episode in Star Trek: The Animated Series, on which she was both the story editor and associate producer. As the outlet pointed out, Fontana was also responsible for the characters of Spock's parents, the Vulcan Sarek and human Amanda, who were introduced in the notable episode "Journey to Babel." In fact, Fontana herself said that she hopes to be remembered for bringing Spock to life. "Primarily the development of Spock as a character and Vulcan as a history/background/culture from which he sprang," she said in a 2013 interview published on the Star Trek official site, when asked what she thought her contributions to the series were. With Star Trek creator Gene Roddenberry, she also penned the episode "Encounter at Farpoint," which launched The Next Generation in 1987. The episode introduced Captain Picard, played by Patrick Stewart, and earned the writing pair a Hugo Award nomination. Fontana was one of four Star Trek writers who re-wrote Harlan Ellison's classic episode The City on the Edge of Forever, and her profile at IMDB.com credits her with the story or teleplay for 11 episodes of the original series. In the 1970s Fontana worked on other sci-fi television shows, including Land of the Lost, The Six Million Dollar Man, and the Logan's Run series. Fontana later also wrote an episode of Star Trek: Deep Space Nine, three episodes of Babylon 5, and even an episode of the fan-created science fiction webseries Star Trek: New Voyages.

Read more of this story at Slashdot.

12:34

Jury Sides With Elon Musk, Rejects $190M Defamation Claim Over Tweet [Slashdot]

Aighearach (Slashdot reader #97,333) shared this story from Reuters: Tesla Inc boss Elon Musk emerged victorious on Friday from a closely watched defamation trial as a federal court jury swiftly rejected the $190 million claim brought against him by a British cave explorer who Musk had branded a "pedo guy" on Twitter. The unanimous verdict by a panel of five women and three men was returned after roughly 45 minutes of deliberation on the fourth day of Musk's trial. Legal experts believe it was the first major defamation lawsuit brought by a private individual over remarks on Twitter to be decided by a jury... The jury's decision signals a higher legal threshold for challenging potentially libelous Twitter comments, said L. Lin Wood, the high-profile trial lawyer who led the legal team for the plaintiff, Vernon Unsworth... Other lawyers specializing in defamation agreed the verdict reflects how the freewheeling nature of social media has altered understandings of what distinguishes libel punishable in court from casual rhetoric and hyperbole protected as free speech. Musk, 48, who had testified during the first two days of the trial in his own defense and returned to court on Friday to hear closing arguments, exited the courtroom after the verdict and said: "My faith in humanity is restored."

Read more of this story at Slashdot.

11:34

Why Is Russia's Suspected Internet Cable Spy Ship In the Mid-Atlantic? [Slashdot]

"Russia's controversial intelligence ship Yantar has been operating in the Caribbean, or mid-Atlantic, since October," writes defense analyst H I Sutton this week in Forbes. He adds that the ship "is suspected by Western navies of being involved in operations on undersea communications cables." Significantly, she appears to be avoiding broadcasting her position via AIS (Automated Identification System). I suspect that going dark on AIS is a deliberate measure to frustrate efforts to analyse her mission. She has briefly used AIS while making port calls, where it would be expected by local authorities, for example while calling at Trinidad on November 8 and again on November 28. However in both cases she disappeared from AIS tracking sites almost as soon as she left port... Yantar has been observed conducting search patterns in the vicinity of internet cables, and there is circumstantial evidence that she has been responsible for internet outages, for example off the Syrian coast in 2016. Yantar is "allegedly an 'oceanographic research vessel'," notes Popular Mechanics, in a mid-November article headlined "Why is Russia's spy ship near American waters?" A study by British think tank Policy Exchange mentioned that the ship carried two submersibles capable of tapping undersea cables for information -- or outright cutting them, the Forbes article points out. "Whether Yantar's presence involves undersea cables, or some other target of interest to the Russians, it will be of particular interest to U.S. forces."

Read more of this story at Slashdot.

10:34

Apple Fails To Stop Class Action Lawsuit Over MacBook Butterfly Keyboards [Slashdot]

Mark Wilson quotes BetaNews: Apple has failed in an attempt to block a class action lawsuit being brought against it by a customer who claimed the company concealed the problematic nature of the butterfly keyboard design used in MacBooks. The proposed lawsuit not only alleges that Apple concealed the fact that MacBook, MacBook Pro and MacBook Air keyboards were prone to failure, but also that design defects left customers out of pocket because of Apple's failure to provide an effective fix. Engadget argues that Apple "might face an uphill battle in court." "While the company has never said the butterfly keyboard design was inherently flawed, it instituted repair programs for that keyboard design and even added the latest 13-inch MacBook Pro to the program the moment it became available. Also, the 16-inch MacBook Pro conspicuously reverted to scissor switches in what many see as a tacit acknowledgment that the earlier technology was too fragile."

Read more of this story at Slashdot.

10:00

CentOS 6 Through CentOS 8 Benchmarks On Intel Xeon [Phoronix]

Complementing the CentOS 8 benchmarks I did following the release of that Red Hat Enterprise Linux 8 rebuild in late September, here are tests going back further to show the performance of CentOS 6, CentOS 7, and CentOS 8 all benchmarked from the same Intel Xeon Scalable server. These tests were done about a month ago, but with all the hardware launches, a new child, and other factors, I am only now getting around to posting the data.

09:34

Hospitals' New Issue: A 'Glut' of Machines Making Alarm Sounds [Slashdot]

"Tens of thousands of alarms shriek, beep and buzz every day in every U.S. hospital," reports Fierce Healthcare -- even though most of them aren't urgent, disturb the patients, and won't get immediate attention anyways: The glut of noise means that the medical staff is less likely to respond. Alarms have ranked as one of the top 10 health technological hazards every year since 2007, according to the research firm ECRI Institute. That could mean staffs were too swamped with alarms to notice a patient in distress or that the alarms were misconfigured. The Joint Commission, which accredits hospitals, warned the nation about the "frequent and persistent" problem of alarm safety in 2013. It now requires hospitals to create formal processes to tackle alarm system safety... The commission has estimated that of the thousands of alarms going off throughout a hospital every day, an estimated 85% to 99% do not require clinical intervention. Staff, facing widespread "alarm fatigue" can miss critical alerts, leading to patient deaths. Patients may get anxious about fluctuations in heart rate or blood pressure that are perfectly normal, the commission said.... In the past 30 years, the number of medical devices that generate alarms has risen from about 10 to nearly 40, said Priyanka Shah, a senior project engineer at ECRI Institute. A breathing ventilator alone can emit 30 to 40 different noises, she said... Maria Cvach, an alarm expert and director of policy management and integration for Johns Hopkins Health System, found that on one step-down unit (a level below intensive care) in the hospital in 2006, an average of 350 alarms went off per patient per day -- from the cardiac monitor alone.... By customizing alarm settings and converting some audible alerts to visual displays at nurses' stations, Cvach's team at Johns Hopkins reduced the average number of alarms from each patient's cardiac monitor from 350 to about 40 per day, she said. Hospitals are also installing sophisticated software to analyze and prioritize the constant stream of alerts before relaying the information to staff members.

Read more of this story at Slashdot.

08:36

Saturday Morning Breakfast Cereal - Card [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
I would actually pay good money for a guy who stands on the street in a tophat telling you uplifting macroeconomic data.


Today's News:

08:15

RADV's ACO Compiler Back-End Now Supported For Older AMD "Sea Islands" GPUs [Phoronix]

The Valve-backed "ACO" compiler back-end for the open-source Radeon "RADV" Vulkan driver has now added support for AMD GCN 1.1 "Sea Islands" graphics cards...

07:14

Some Of The Interesting Open-Source Projects For Outreachy's Winter 2019 Round [Phoronix]

Outreachy recently kicked off their winter (December to March) round of internships for diversity in tech with 49 individuals tackling a range of open-source tasks...

04:36

Wayland's Weston 8.0 Reaches Alpha With EGL Partial Updates, Headless OpenGL [Phoronix]

Weston release manager Simon Ser on Friday released Wayland's Weston 8.0 reference compositor in alpha form...

03:25

Linux 5.5 KVM Adds POWER Support For Secure Guests/VMs [Phoronix]

IBM's work from over a year ago towards secure virtual machines on POWER hardware is finally coming to fruition with Linux 5.5, due out early next year...

03:01

OpenBSD bugs, Microsoft's bad update, a new Nork hacking crew, and more [The Register]

Meanwhile, the DOJ sets its sights on money mules

Welcome to yet another El Reg security roundup. Off we go.…

01:13

Mesa 20.0 Now Includes Intel's Gallium3D Driver To Build By Default [Phoronix]

As part of Intel's ongoing plan to use their new Gallium3D OpenGL Linux driver by default for Broadwell "Gen8" graphics and newer in next quarter's Mesa 20.0, another step in that direction was taken on Friday...

Friday, 06 December

22:59

Linux 5.5 Adds NFS Client Support For Cross-Device Offloaded Copies (Server To Server) [Phoronix]

NFSv4.2 introduced server-side copy (SSC) functionality, and the Linux 5.5 kernel's NFS client now supports it, allowing "inter" copy offloads between different NFS servers...

22:02

GRUB Now Supports Btrfs 3/4-Copy RAID1 Profiles (RAID1C3 / RAID1C4 On Linux 5.5+) [Phoronix]

When it comes to the storage/file-system changes with the in-development Linux 5.5 kernel, one of the most prominent end-user-facing changes is more robust RAID1 for Btrfs, with the ability to have three or four copies of the data rather than just two, should data safety be of utmost importance and there be concern over the possibility of two disks in an array failing...

17:40

FTC kicks feet through ash pile that once was Cambridge Analytica with belated verdict [The Register]

Trade boss says long-dead biz was indeed deceiving the public

The US Federal Trade Commission has issued what looks to be a largely symbolic ruling against the remnants of data-harvesting marketers Cambridge Analytica.…

17:15

Elon Musk gets thumbs up from jury for use of 'pedo guy' in cave diver defamation lawsuit [The Register]

CEO's tweeted taunt totally fine, twelve jurors decide

Billionaire Elon Musk did not defame British cave explorer Vernon Unsworth, a Los Angeles jury concluded on Friday.…

16:32

Debian Developers Take To Voting Over Init System Diversity [Phoronix]

It's already been five years since Debian voted to transition to systemd over Upstart, and now a new vote has just commenced to judge the interest in "init system diversity" and just how much Debian developers care (or not) about supporting alternatives to systemd...

16:09

Forget sharks with lasers, NASA kits out an elephant seal with a sensor-studded skullcap [The Register]

We're never gonna survive unless, we get a little crazy

NASA’s science team has a new female recruit and she's probing the watery depths of the Antarctic in a quest to help climate eggheads understand our climate.…

14:31

WebAssembly gets nod from W3C and, most likely, an embrace from cryptojackers online [The Register]

Standardization of wasm for the web offers a new take on the same old problems

The World Wide Web Consortium (W3C) on Thursday published three WebAssembly specifications as W3C Recommendations, officially endorsing a technology touted for the past few years as a way to accelerate web code, to open the web to more programming languages, and to make code created for the web more portable and safe.…

13:16

Wine 5.0 Code Freeze To Begin Next Week [Phoronix]

As expected by Wine's annual release cadence, next week Wine 5.0 will enter its code freeze followed by release candidates until this next stable Wine release is ready to ship around early 2020...

13:07

China fires up 'Great Cannon' denial-of-service blaster, points it toward Hong Kong [The Register]

Protest organizers come under fire from network traffic barrage

China is reportedly using the 'cannon' capabilities of its massive domestic internet to try and take down anti-government websites in Hong Kong.…

12:50

Google Reaffirms Commitment To Kotlin Programming Language For Android [Phoronix]

The Kotlin programming language on Android has become very popular, and Google announced today that nearly 60% of the top 1,000 Android applications are using Kotlin code in some capacity. Beyond their announcement earlier this year that Android development is Kotlin-first, they are planning more Kotlin + Android action as they look forward to 2020...

12:00

Samsung Galaxy S11 tipped to escalate the phone cam arms race with 108MP sensor [The Register]

Squaring up to the iPhone 11

Samsung is expected to next year release its newest flagship, the Galaxy S11. And, as is the case with any high-profile phone release, details are steadily leaking from the chaebol's notoriously porous supply chain.…

12:00

Ardour Digital Audio Workstation Finally Adds Native MP3 Importing Support [Phoronix]

While lossy compression audio formats like MP3 are not recommended for professional audio work, for those using the open-source Ardour digital audio workstation (DAW) software there is, as of today, finally native MP3 import support...

11:15

Cloud vendors burp after last year's server sales feast, couldn't possibly eat any more [The Register]

Not even a wafer-thin blade?

The global server market hasn't been able to match the heady highs of 2018 so far this year and Q3 was no exception as both shipments and the value of those boxes dropped.…

10:25

Ireland's B.ICONIC snaffles Stormfront to become largest Apple reseller in the UK [The Register]

May we suggest a rebrand?

B.ICONIC, the parent of one of Ireland's largest Apple Premium Resellers (APRs), is buying Stormfront – the UK Apple retailer that is, rather than the Aryan social network.…

09:55

Still in preview, but look! You can now develop Azure Sphere apps in Linux – if you dare [The Register]

19.11 brings penguin support and a Visual Studio Code extension

Microsoft's forever-in-preview Azure Sphere received an important update this week, bringing a Linux SDK (also in preview form) and Visual Studio Code support.…

09:01

Nokia 2.3: HMD flings out €109 budget 'droid with a 2-day battery [The Register]

But get ready to flip your cables cos it's microUSB

HMD Global, the licensee of the once-ubiquitous Nokia mobile phone brand, today unveiled its latest budget blower, the Nokia 2.3.…

08:16

Ohhh, you're so rugged! Microsoft swoons at new Lenovo box pushing Azure to the edge [The Register]

Fix it to a wall, stick it on a shelf

While the public cloud might have once been all the rage, the cold light of day has brought the realisation that bandwidth, compliance and convenience means that something a little more local is needed.…

07:57

Systemd-homed Looks Like It Will Be Merged Soon For systemd 245 [Phoronix]

Announced back in September at the All Systems Go event in Berlin was systemd-homed as a new effort to improve home directory handling. Systemd-homed wants to make it easier to migrate home directories, ensure all user data is self-contained, unify user-password and encryption handling, and provide other modern takes on home/user directory functionality. That code is expected to soon land in systemd...

07:43

Listen up you bunch of bankers. Here are some pointers for less crap IT [The Register]

UK regulators hash out cheat sheet to avoid total meltdown

The Bank of England has teamed up with other regulators to offer UK banks a little advice on sorting out their woeful IT systems.…

07:39

Saturday Morning Breakfast Cereal - Podcast [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
Honestly, I'm not sure this hasn't been done already.


Today's News:

07:06

Cloudflare’s Response to CSAM Online [The Cloudflare Blog]


Responding to incidents of child sexual abuse material (CSAM) online has been a priority at Cloudflare from the beginning. The stories of CSAM victims are tragic, and bring to light an appalling corner of the Internet. When it comes to CSAM, our position is simple: We don’t tolerate it. We abhor it. It’s a crime, and we do what we can to support the processes to identify and remove that content.

In 2010, within months of Cloudflare’s launch, we connected with the National Center for Missing and Exploited Children (NCMEC) and started a collaborative process to understand our role and how we could cooperate with them. Over the years, we have been in regular communication with a number of government and advocacy groups to determine what Cloudflare should and can do to respond to reports about CSAM that we receive through our abuse process, or how we can provide information supporting investigations of websites using Cloudflare’s services.

Recently, 36 tech companies, including Cloudflare, received this letter from a group of U.S Senators asking for more information about how we handle CSAM content. The Senators referred to influential New York Times stories published in late September and early November that conveyed the disturbing number of images of child sexual abuse on the Internet, with graphic detail about the horrific photos and how the recirculation of imagery retraumatizes the victims. The stories focused on shortcomings and challenges in bringing violators to justice, as well as efforts, or lack thereof, by a group of tech companies including Amazon, Facebook, Google, Microsoft, and Dropbox, to eradicate as much of this material as possible through existing processes or new tools like PhotoDNA that could proactively identify CSAM material.  

We think it is important to share our response to the Senators (copied at the end of this blog post), talk publicly about what we’ve done in this space, and address what else we believe can be done.

How Cloudflare Responds to CSAM

From our work with NCMEC, we know that they are focused on doing everything they can to validate the legitimacy of CSAM reports and then work as quickly as possible to have website operators, platform moderators, or website hosts remove that content from the Internet. Even though Cloudflare is not in a position to remove content from the Internet for users of our core services, we have worked continually over the years to understand the best ways we can contribute to these efforts.

Addressing Reports

The first prong of Cloudflare’s response to CSAM is proper reporting of any allegation we receive. Every report we receive about content on a website using Cloudflare’s services filed under the “child pornography” category on our abuse report page leads to three actions:

  1. We forward the report to NCMEC. In addition to the content of the report made to Cloudflare, we provide NCMEC with information identifying the hosting provider of the website, contact information for that hosting provider, and the origin IP address where the content at issue can be located.
  2. We forward the report to both the website operator and hosting provider so they can take steps to remove the content, and we provide the origin IP of where the content is located on the system so they can locate the content quickly. (Since 2017, we have given reporting parties the opportunity to file an anonymous report if they would prefer that either the host or the website operator not be informed of their identity).
  3. We provide anyone who makes a report information about the identity of the hosting provider and contact information for the hosting provider in case they want to follow up directly.

Since our founding, Cloudflare has forwarded 5,208 reports to NCMEC. Over the last three years, we have provided 1,111 reports in 2019 (to date), 1,417 in 2018, and 627 in 2017.  

Reports filed under the “child pornography” category account for about 0.2% of the abuse complaints Cloudflare receives. These reports are treated as the highest priority for our Trust & Safety team and they are moved to the front of the abuse response queue. We are generally able to respond by filing the report with NCMEC and providing the additional information within a matter of minutes regardless of time of day or day of the week.


Requests for Information

The second main prong of our response to CSAM is operation of our “trusted  reporter” program to provide relevant information to support the investigations of nearly 60 child safety organizations around the world. The "trusted reporter" program was established in response to our ongoing work with these organizations and their requests for both information about the hosting provider of the websites at issue as well as information about the origin IP address of the content at issue. Origin IP information, which is generally sensitive security information because it would allow hackers to circumvent certain security protections for a website, like DDoS protections, is provided to these organizations through dedicated channels on an expedited basis.

Like NCMEC, these organizations are responsible for investigating reports of CSAM on websites or hosting providers operated out of their local jurisdictions, and they seek the resources to identify and contact those parties as quickly as possible to have them remove the content. Participants in the “trusted reporter” program include groups like the Internet Watch Foundation (IWF), the INHOPE Foundation, the Australian eSafety Commission, and Meldpunt. Over the past five years, we have responded to more than 13,000 IWF requests, and more than 5,000 requests from Meldpunt. We respond to such requests on the same day, and usually within a couple of hours. In a similar way, Cloudflare also receives and responds to law enforcement requests for information as part of investigations related to CSAM or exploitation of a minor.


Among this group, the Canadian Centre for Child Protection has been engaged in a unique effort that is worthy of specific mention. The Centre’s Cybertip program operates their Project Arachnid initiative, a novel approach that employs an automated web crawler that proactively searches the Internet to identify images that match a known CSAM hash, and then alerts hosting providers when there is a match. Based on our ongoing work with Project Arachnid, we have responded to more than 86,000 reports by providing information about the hosting provider and the origin IP address, which we understand they use to contact that hosting provider directly with that report and any subsequent reports.

Although we typically process these reports within a matter of hours, we’ve heard from participants in our “trusted reporter” program that the non-instantaneous response from us causes friction in their systems. They want to be able to query our systems directly to get the hosting provider and origin IP  information, or better, be able to build extensions on their automated systems that could interface with the data in our systems to remove any delay whatsoever. This is particularly relevant for folks in the Canadian Centre’s Project Arachnid, who want to make our information a part of their automated system.  After scoping out this solution for a while, we’re now confident that we have a way forward and informed some trusted reporters in November that we will be making available an API that will allow them to obtain instantaneous information in response to their requests pursuant to their investigations. We expect this functionality to be online in the first quarter of 2020.

Termination of Services

Cloudflare takes steps in appropriate circumstances to terminate its services from a site when it becomes clear that the site is dedicated to sharing CSAM or if the operators of the website and its host fail to take appropriate steps to take down CSAM content. In most circumstances, CSAM reports involve individual images that are posted on user generated content sites and are removed quickly by responsible website operators or hosting providers. In other circumstances, when operators or hosts fail to take action, Cloudflare is unable on its own to delete or remove the content but will take steps to terminate services to the  website.  We follow up on reports from NCMEC or other organizations when they report to us that they have completed their initial investigation and confirmed the legitimacy of the complaint, but have not been able to have the website operator or host take down the content. We also work with Interpol to identify and discontinue services from such sites they have determined have not taken steps to address CSAM.

Based upon these determinations and interactions, we have terminated service to 5,428 domains over the past 8 years.

In addition, Cloudflare has introduced new products where we do serve as the host of content, and we would be in a position to remove content from the Internet, including Cloudflare Stream and Cloudflare Workers.  Although these products have limited adoption to date, we expect their utilization will increase significantly over the next few years. Therefore, we will be conducting scans of the content that we host for users of these products using PhotoDNA (or similar tools) that make use of NCMEC’s image hash list. If flagged, we will remove that content immediately. We are working on that functionality now, and expect it will be in place in the first half of 2020.

Part of an Organized Approach to Addressing CSAM

Cloudflare’s approach to addressing CSAM operates within a comprehensive legal and policy backdrop. Congress and the law enforcement and child protection communities have long collaborated on how best to combat the exploitation of children. Recognizing the importance of combating the online spread of CSAM, NCMEC first created the CyberTipline in 1998, to provide a centralized reporting system for members of the public and online providers to report the exploitation of children online.

In 2006, Congress conducted a year-long investigation and then passed a number of laws to address the sexual abuse of children. Those laws attempted to calibrate the various interests at stake and coordinate the ways various parties should respond. The policy balance Congress struck on addressing CSAM on the Internet had a number of elements for online service providers.

First, Congress formalized NCMEC’s role as the central clearinghouse for reporting and investigation, through the CyberTipline. The law adds a requirement, backed up by fines, for online providers to report any reports of CSAM to NCMEC. The law specifically notes that to preserve privacy, they were not creating a requirement to monitor content or affirmatively search or screen content to identify possible reports.

Second, Congress responded to the many stories of child victims who emphasized the continuous harm done by the transmission of imagery of their abuse. As described by NCMEC, “not only do these images and videos document victims’ exploitation and abuse, but when these files are shared across the internet, child victims suffer re-victimization each time the image of their sexual abuse is viewed” even when viewed for ostensibly legitimate investigative purposes. To help address this concern, the law directs providers to minimize the number of employees provided access to any visual depiction of child sexual abuse.  

Finally, to ensure that child safety and law enforcement organizations had the records necessary to conduct an investigation, the law directs providers to preserve not only the report to NCMEC, but also “any visual depictions, data, or other digital files that are reasonably accessible and may provide context or additional information about the reported material or person” for a period of 90 days.


Because Cloudflare’s services are used so extensively—by more than 20 million Internet properties, and based on data from W3Techs, more than 10% of the world’s top 10 million websites—we have worked hard to understand these policy principles in order to respond appropriately in a broad variety of circumstances. The processes described in this blogpost were designed to make sure that we comply with these principles, as completely and quickly as possible, and take other steps to support the system’s underlying goals.

Conclusion

We are under no illusion that our work in this space is done. We will continue to work with groups that are dedicated to fighting this abhorrent crime and provide tools to more quickly get them information to take CSAM content down and investigate the criminals who create and distribute it.

Cloudflare's Senate Response (PDF)


07:00

Continuous Lifecycle London blind bird offer takes off soon: Book your place today at our DevOps conference [The Register]

Save yourself a few quid and we'll see you in May 2020

Event  If DevOps, containers, CI/CD and serverless are on your agenda for next year, grabbing a blind-bird ticket for our Continuous Lifecycle London conference should be top of your end-of-year todo list.…

06:50

Apple: Mysterious iPhone 11 location pings were because of 'ultra-wideband compliance' [The Register]

NVM, we'll give you a toggle to deactivate UWB... in the future-ture-ture

For a company that prides itself on its privacy credentials, Apple received a bit of a bloody nose earlier this week when long-time security journalist Brian Krebs revealed the iPhone 11 Pro intermittently seeks the user’s location — even when there are no applications with location permissions in use.…

05:59

Hey kids! Forget about Disney – who fancies a trip to DevOps World? [The Register]

Come with us through the gates of Jenkins Land to admire the Java dinosaurs within

DevOps World Lisbon  Love was in the air at the CloudBees-sponsored DevOps World in Lisbon this week as the 900 or so attendees were treated to public displays of affection with Google both on stage and behind the scenes.…

05:23

RadeonSI NIR Benchmarks Show Great Progress With Mesa 20.0 [Phoronix]

With AMD last week having enabled OpenGL 4.6 for their RadeonSI OpenGL Linux driver when enabling the NIR intermediate representation support, you may be wondering how using NIR is stacking up these days compared to the default TGSI route. Here are some benchmarks on Polaris, Vega, and Navi for comparing this driver option that ultimately allows OpenGL 4.6 to be flipped on.

05:08

Reasons to be fearful 2020: Smishing, public Wi-Fi, deepfakes... and all the usual suspects [The Register]

Too soon for New Year Resolutions?

Cybercriminals will continue to exploit tried-and-tested fraud methods but also adopt a couple of new takes and targets in the year ahead.…

04:26

A General Notification Queue Was Pushed Back From Linux 5.5 Introduction [Phoronix]

Red Hat has been working on a "general notification queue" that is built off the Linux kernel's pipe code and will notify the user-space of events like key/keyring changes, block layer events like disk errors, USB attach/remove events, and other notifications without user-space having to continually poll kernel interfaces. This general notification queue was proposed for Linux 5.5 but has been pushed back to at least 5.6...

04:10

DeepMind founder behind NHS data slurp to be beamed up to Google mothership [The Register]

Great job, now let's do some applied AI with the big boys

Mustafa Suleyman, one of the founders of DeepMind, is to join Google's applied AI division.…

02:02

Doogee Wowser: The S40's a terrible smartphone, but a passable projectile [The Register]

How the worst mobe I ever used maimed an American teen

Comment  Earlier this year, I reviewed arguably the worst phone I've ever used in eight years of covering tech for a living: the Doogee S40. I've always prided myself on my fairness, but I genuinely couldn't find a silver lining to this appalling waste of rare-earth metals. It had a crap screen, a weak camera, and was frustratingly slow to use.…

02:00

NetworkManager Adds Support For Enhanced Open / Opportunistic Wireless Encryption [Phoronix]

Opportunistic Wireless Encryption (OWE) provides a means of encrypting wireless data transfers without having any secret/key. Opportunistic Wireless Encryption is advertised as Wi-Fi Certified Enhanced Open...

01:05

Whoooooa, this node is on fire! Forget Ceph, try the forgotten OpenStack storage release 'Crispy' [The Register]

Behold the 'heaving monstrosity of pulsing evil'

On Call  Friday has arrived once again with a tale from the smouldering world of On Call.…

01:00

5 cool terminal pagers in Fedora [Fedora Magazine]

Large files like logs or source code can run into the thousands of lines. That makes navigating them difficult, particularly from the terminal. Additionally, most terminal emulators have a scrollback buffer of only a few hundred lines. That can make it impossible to browse large files in the terminal using utilities which print to standard output like cat, head and tail. In the early days of computing, programmers solved these problems by developing utilities for displaying text in the form of virtual “pages” — utilities imaginatively described as pagers.

Pagers offer a number of features which make text file navigation much simpler, including scrolling, search functions, and the ability to feature as part of a pipeline of commands. In contrast to most text editors, some terminal pagers do not require loading the entire file for viewing, which makes them faster, especially for very large files.
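To make that idea concrete, here is a purely illustrative toy sketch in Python of what paging means at its core: stream a screenful at a time instead of loading the whole file. It is a teaching aid, not a substitute for the real utilities below, and the 24-line page size is an assumption, since real pagers query the terminal size.

import itertools
import sys

PAGE = 24  # assumed lines per "page"; real pagers read the terminal size

def page_through(path):
    # Read lazily from the file handle so the whole file never sits in memory.
    with open(path, errors="replace") as handle:
        while True:
            chunk = list(itertools.islice(handle, PAGE))
            if not chunk:
                break
            sys.stdout.writelines(chunk)
            if input("-- More -- (Enter to continue, q to quit) ").strip().lower() == "q":
                break

if __name__ == "__main__":
    page_through(sys.argv[1])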

In the modern era of Linux computing, terminal emulators are more sophisticated than ever. They offer support for a kaleidoscope of colors, terminal resizing, as well as a host of other features to make parsing text on screen easier and more efficient. Terminal pagers have undergone a similar evolution, from extremely simple UNIX utilities like pg and more, to sophisticated programs with a wide range of features, covering any number of use cases. With this in mind, we’ve put together a list of some of the most popular terminal paging utilities — more or less.

More

more is one of the earliest pagers, initially featured in version 3.0 BSD. The first implementation of more was written in 1978 by Daniel Halbert. Since then, more has become a ubiquitous feature of many operating systems, including Windows, OS/2, MacOS and most Linux distributions.

more is a very lightweight utility. The version featured in util-linux runs to just under 2100 lines of C. However, this small footprint comes at a price. Most versions of more feature relatively limited functionality, with no support for backwards scroll or search. Commands are similarly stripped back: press enter to scroll one line, or space to scroll one page. Some other useful commands include:

  • Press v while reading to open the current file in your default terminal editor.
  • ‘/pattern’ lets you search for the next occurrence of pattern.
  • :n and :p will open the next and previous files respectively when more is called with more than one file as arguments.

Less

less was initially conceived as a successor to more, addressing some of its limitations. Building on the functionality of more, less adds a number of useful features, including backward scrolling and backward searching. It is also more amenable to window resizing.

Navigation in less is similar to more, though less borrows a few useful commands from the vi editor as well. Users can navigate the document using the familiar home row navigational keys. A glance at the man page for less reveals a fairly rich repertoire of available commands. Some particularly useful examples include:

  • ?pattern lets you search backwards in the file for pattern
  • &pattern shows only lines which feature pattern. This is particularly useful for those who find themselves issuing $ grep pattern | less regularly.
  • Calling less with the -s (--squeeze-blank-lines) flag allows you to view text files with large gaps. Multiple newline characters are reduced to single breaks.
  • s filename, called from within the program, saves input to filename (if input is a pipe).
  • Alternatively, calling less with the -o filename flag will save the input of less to filename.

With this enhanced functionality comes a little extra weight. The version of less that ships with Fedora at the time of writing clocks in at around 25000 lines of source code. Granted, for all but the most storage constrained systems, this is a non-issue. Besides, less is more than more.

Most

While less aims to expand on the existing capabilities of more, most takes a different approach. Rather than expanding on the traditional single file view, most gives users the ability to split their view into “windows.” Each window contains different files in different viewing modes.

Significantly, most takes into account the width of its input text. The default viewing mode doesn’t wrap text (-S in less), a feature particularly useful when dealing with “wide” files. While these design decisions might represent a significant departure from tradition for some users, the end result is very powerful.

In addition to the navigation commands offered by more, most uses intuitive mnemonics for file navigation. For example, t moves to the top of a file, and b moves to the bottom. As a result, users unfamiliar with vi and its descendants will find most to be refreshingly simple.

The distinguishing feature of most is its ability to split windows and contexts quickly and easily. For example, one could open two distinct text files using the following:

$ most textFile1.txt textFile2.txt

In order to split the screen horizontally, use the key combos Ctrl+x, 2 or Ctrl+w, 2. The command :n will open the next file argument in a given window, offering a split screen view of two files:

If you turn wrap off in one window, it does not affect the behavior of other windows. The \ character indicates a wrap or fold, while the $ character indicates that the file extends past the limitations of the current window.

pspg

Those who work with SQL databases often need to be able to examine the contents of their databases at a glance. The command line interfaces for many popular open source DBMSs, such as MySQL and PostgreSQL, use the system default pager to view outputs that don’t fit on a single screen. Utilities like more and less are designed around the idea of presenting text files, but for more structured data they leave something to be desired. Naive text paginating programs have no concept of broad, tabular data, which can be frustrating when dealing with large queries.

pspg attempts to address this by offering users the ability to freeze columns while viewing, sort data in situ, and colourize output. While pspg was intended initially to serve as a pager replacement for psql specifically, the program also supports the viewing of CSV data, and is a suitable drop-in replacement for mysql and pgcli.

Vim

In a modern, technicolor terminal, the idea of endless pages of drab grey on black text can feel like something of an anachronism. The syntax highlighting options offered by powerful text editors like vim can be useful for browsing source code. Furthermore, the search functions offered by vim vastly outclass the competition. With this in mind, vim ships with a shell script less.sh that lets vim serve as a replacement for conventional pagers.


To set vim as the default pager for man pages, add the following to your shell’s config (such as ~/.bashrc if using the default bash shell):

export MANPAGER="/bin/sh -c \"col -b | vim -c 'set ft=man ts=8 nomod nolist nonu noma' -\""

Alternatively, to set vim as the default pager system-wide, locate the less.sh script. (You can find it at /usr/share/vim/vim81/macros/ on current Fedora systems.) Export this location as the PAGER environment variable to use it as the default pager, or set up an alias to invoke it explicitly.


Photo by Cathy Mü on Unsplash.

00:01

You looking for an AI project? You love Lego? Look no further than this Reg reader's machine-learning Lego sorter [The Register]

All you need is tens of thousands of Lego bricks, a Raspberry Pi, and a laptop GPU

An engineer has built something that is sure to be the envy of any self-respecting Lego fan: an AI-powered Lego sorting machine.…

Thursday, 05 December

23:39

GCC 10's C++20 "Spaceship Operator" Support Appears To Be In Good Shape [Phoronix]

One of the prominent additions coming with the C++20 programming language is the consistent comparison operator, or "spaceship operator" as it's commonly referred to. The support was merged for GCC 10 last month, ahead of entering stage three development, and this week some more improvements were made to the implementation...

23:00

SANS Announces 13th Holiday Hack Challenge and 2nd KringleCon infosec conference [The Register]

Sign up, tune in, expand your knowledge, and compete in hacking contests

Promo  Next week, SANS will launch its second annual KringleCon virtual conference followed shortly thereafter by its 13th Holiday Hack Challenge.…

22:06

Linux 5.5 Lands Broadcom BCM2711 / Raspberry Pi 4 Bits [Phoronix]

Following last week's Arm architecture updates for Linux 5.5, sent in via four pull requests on Thursday was all the new and improved hardware enablement for the SoCs and single-board computer platforms...

22:01

Tricky VPN-busting bug lurks in iOS, Android, Linux distros, macOS, FreeBSD, OpenBSD, say university eggheads [The Register]

OpenVPN, WireGuard, IKEv2/IPSec also vulnerable to tampering flaw, we're told

A bug in the way Unix-flavored systems handle TCP connections could put VPN users at risk of having their encrypted traffic hijacked, it is claimed.…

19:13

Pentagon's $10bn JEDI decision 'risky for the country and democracy,' says AWS CEO Jassy [The Register]

Presidential 'disdain' may have been a factor in awarding mega-contract to Microsoft, says cloud supremo

re:Invent  Amazon Web Services CEO Andy Jassy faced the press yesterday at Amazon's re:Invent conference in Las Vegas, and there was one thing above all else that journos wanted to discuss.…

18:42

If you want an example of how user concerns do not drive software development, check out this Google-backed API [The Register]

App detection interface sparks privacy worries

Comment  A nascent web API called getInstalledRelatedApps offers a glimpse of why online privacy remains such an uncertain proposition.…

17:52

Asteroid Bennu is flinging particles of dust and rock from its surface – and scientists can't work out why [The Register]

Images beamed back from NASA's OSIRIS-REx spacecraft leave scientists baffled

Pic  A closeup image of Bennu snapped by NASA's OSIRIS-REx spacecraft reveals that the asteroid's surface is surprisingly volatile, randomly spitting out shards of debris into space.…

16:30

Some Of The Possible Changes Coming For The Desktop With Ubuntu 20.04 LTS [Phoronix]

While we aren't even half-way through the Ubuntu 20.04 LTS development cycle yet, Ubuntu's Trello board provides a look at some of the changes and new features being at least considered for this next Ubuntu long-term support release...

16:05

VCs find exciting new way to blow $1m: Wire it directly to hackers after getting spoofed [The Register]

Who needs an elevator pitch when you have man-in-the-middle attack?

A group of hackers used a compromised email account to steal a start-up's $1m venture capital payment.…

15:25

Debian Installer Bullseye Alpha 1 Released [Phoronix]

Debian 11 "Bullseye" isn't expected to be released until well into 2021 but out today is the first alpha release of the Debian Installer that will ultimately power that next major Debian GNU/Linux release...

15:22

If there's somethin' stored in a secure enclave, who ya gonna call? Membuster! [The Register]

Boffins ride the memory bus past Intel's SGX to your data

Computer scientists from UC Berkeley, Texas A&M, and semiconductor biz SK Hynix have found a way to defeat secure enclave protections by observing memory requests from a CPU to off-chip DRAM through the memory bus.…

14:22

Uncle Sam challenged in court for slurping social media info on 'millions' of visa applicants [The Register]

Documentary filmmakers lob sue ball to halt practice

The US State Department is being sued over its policy of crawling the social media accounts of people applying for entry visas.…

13:30

Following the wild, roaring success of its Snapdragon 8cx Arm laptop chip, Qualcomm's back with the 8c, 7c [The Register]

Looking forward to seeing these in, well, anything would be nice

Qualcomm will today expand its range of Snapdragon system-on-chips for always-connected Arm-based Windows 10 tablet-laptops from one to three.…

12:10

Your duckface better be flawless: Huawei's Nova 6 mobe has a needlessly powerful selfie camera [The Register]

Highest-ranked front shooter yet for the poser in your life

The middle ground of the smartphone market is a bit of a battleground. Manufacturers of all stripes – except Apple – keep flinging devices at punters with fairly high-end specs, but price tags under the £500 mark. The latest salvo comes from embattled Chinese comms giant Huawei, which today announced the launch of its Nova 6 handset.…

11:55

NVIDIA Looks To Have Some Sort Of Open-Source Driver Announcement For 2020 [Phoronix]

Start looking forward to March when NVIDIA looks to have some sort of open-source driver initiative to announce -- likely contributing more to Nouveau and we're crossing our fingers they will have sorted out the signed firmware situation to unblock those developers from delivering re-clocking support to yield better driver performance...

11:20

Scammy and spammy harassers are chasing veteran pros off crypto-collab platform Keybase [The Register]

What happens when you throw your lot in with crypto-coin types

Collaboration site Keybase, once touted for its encrypted meetup channels and robust developer features, is struggling to ward off an epidemic of harassment and spam brought about by its shift toward cryptocurrency.…

10:44

Huawei with your rural subsidies ban: Chinese comms bogeyman fires sueball at US regulator [The Register]

Claims it's unconstitutional

Huawei Technologies today filed a fresh lawsuit against the US Federal Communications Commission over its decision to ban rural carriers from buying the company's mobile hardware with Universal Service Fund (USF) cash.…

09:56

Purism Announces Librem 5 "USA" Model For $1999 USD [Phoronix]

Purism today announced a Librem 5 USA model of their smartphone that has the same specifications and features as their Librem 5 Linux smartphone but is manufactured in the US. That pushes the 720x1440 display, i.MX8M, 3GB RAM, 32GB eMMC, 802.11n device from $699 USD to $1,199 USD. Update: Errr, the price has now apparently been raised to $1,999 USD...

09:49

Feds slap $5m bounty on 'Evil Corp' Russian duo accused of running ZeuS, Dridex banking trojans [The Register]

Account-draining malware masterminds charged but remain in motherland

US prosecutors have slapped a $5m bounty on the heads of two Russian nationals they claim are part of the malware gang behind the banking trojans ZeuS and Dridex.…

09:24

Saturday Morning Breakfast Cereal - Who [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
The oldest I get, the more I start sympathizing with villains.


Today's News:

BAHFest is coming back this year, to Houston and to London! Submissions are now open!

09:12

Intel Publishes oneAPI Level 0 Specification [Phoronix]

Back at SC19 Intel released a beta of their oneAPI Base Toolkit for software developers to work on performance-optimized, cross-device software. Complementing that initial software beta is now the oneAPI Level 0 Specification...

08:44

How to fool infosec wonks into pinning a cyber attack on China, Russia, Iran, whomever [The Register]

Learning points, not an instruction manual

Black Hat Europe  Faking digital evidence during a cyber attack – planting a false flag – is simple if you know how, as noted infosec veteran Jake Williams told London's Black Hat Europe conference.…

08:00

An Extensive Look At The AMD Naples vs. Rome Power Efficiency / Performance-Per-Watt [Phoronix]

Since the AMD EPYC 7002 "Rome" series launch in August we have continued to be captivated by the raw performance of AMD's Zen 2 server processors across many different workloads, as covered now in countless articles. The performance-per-dollar / TCO is also extremely competitive against Intel's Xeon Scalable line-up, but how is the power efficiency of these 7nm EPYC processors? We waited to deliver those numbers until having a retail Rome board for carrying out those tests, and now, after several weeks of benchmarking, here is an extensive exploration of the AMD EPYC 7002 series power efficiency as well as a look at the peak clock frequencies being achieved in various workloads to also provide some performance-per-clock metrics compared to Naples.

07:54

Onestream slammed for 'slamming' vulnerable and elderly folk: That's £35k to Ofcom, please [The Register]

Comms provider switched 118 people to its services without their consent

Updated  Imagine a telco kicking down your front door, yelling "all your bills are belong to us" then leaving. In the industry parlance, it's known as "slamming" and Ofcom has fined Onestream £35,000 for the practice.…

07:19

BeOS-Inspired Haiku Continues Working On 64-bit ARM, Other Hardware Improvements [Phoronix]

The open-source Haiku operating system project working off inspirations from BeOS continued to be quite active over the past two months in adding various modern features and fixes to their platform...

07:18

Staffer representation on our board? LMAO! Good one, cackles Microsoft [The Register]

Another $0.51 dividend for shareholders, exec pay OK'd, but no say for employees

Updated  There was good news for investors and perhaps bad news for employees during Microsoft's annual shareholder meeting.…

05:59

Windows 10 Insiders: Begone, foul Store version of Notepad! [The Register]

You say 20H1, they say 2004, let's call the whole thing off

Microsoft has emitted a fresh build of next year's Windows 10 to both the Slow and Fast rings of the Windows Insider programme and goodness, those guinea pigs weren't keen on Notepad-In-The-Store.…

05:42

Icahn and I will force a Xerox and HP wedding: Corporate raider urges HP shareholders to tell board to act 'NOW' [The Register]

Billionaire accuses execs of running scared for jobs amid $33.5bn bid

Corporate raider Carl Icahn isn't quietly accepting HP's rejection of Xerox's hostile $33.5bn takeover bid – he has accused the board of using delay tactics to keep their jobs and warned that it can be done in a nice or not-so-nice way.…

05:07

Oil be damned: Iran-based crooks flinging malware at Middle Eastern energy plants again – research [The Register]

ZeroCleare wipes up where Shamoon left off

An Iran-based hacking crew long known to target energy facilities in neighboring Middle Eastern countries is believed to be launching new attacks.…

04:57

Lima Gets Tiling While Vulkan Turnip Lands SSBO + Compute Shaders [Phoronix]

Even with the holidays fast approaching Mesa developers continue to be quite busy in landing new features ahead of next quarter's Mesa 20.0 release. The Lima Gallium3D driver and Turnip Vulkan driver are the latest benefiting from the Git code...

04:36

Motorola's mid-range One Hyper packs 64MP cam, huge screen and – ooo – 'Quad Pixel' tech [The Register]

What's that when it's at home? Oh, they mean pixel binning

Motorola has updated its mid-range lineup with the announcement of the Android 10-powered One Hyper.…

04:06

Microsoft emits long-term support .NET Core 3.1, Visual Studio 16.4 [The Register]

Ready to go, but beware 'unfortunate breaking change' in Windows Forms

Microsoft has released .NET Core 3.1 – a significant milestone as, unlike version 3.0, it is a long-term support (LTS) release, suggesting that the company believes it's fit for extended use. It is accompanied by Visual Studio 16.4, also an LTS release.…

03:02

Take Sajid Javid's comments on IR35 UK contractor rules with a bucket of salt, warns tax guru [The Register]

What now? A pre-election porky? Heaven forfend…

Conservative Party claims they may review the extension of IR35 tax rules to the UK private sector have been called into question by a tax expert.…

02:00

Just in case you were expecting 10Gbps, Wi-Fi 6 hits 700Mbps in real-world download tests [The Register]

Pretty fly for a Wi-Fi...

Updated  The long-awaited future of super-fast wireless is here, with the Wireless Broadband Alliance (WBA) claiming speeds of 700Mbps in a real-world environment using the Wi-Fi 6 standard.…

01:01

We know this sounds weird but in future we could ask fiber optic cables: Did the earth move for you... literally? [The Register]

Distributed acoustic sensing turns old glass cabling into seismic sensors

Old, unused fiber optic cables buried underground can be refashioned into seismometers, helping scientists monitor earthquakes, according to new research.…

00:28

Thinking about color [The Cloudflare Blog]

Color is my day-long obsession, joy and torment - Claude Monet


Over the last two years we’ve tried to improve our usage of color at Cloudflare. There were a number of forcing functions that made this work a priority. As a small team of designers and engineers we had inherited a bunch of design work that was a mix of values built by multiple teams. As a result it was difficult and unnecessarily time consuming to add new colors when building new components.

We also wanted to improve our accessibility. While we were doing pretty well, we had room for improvement, largely around how we used green. As our UI is increasingly centered around visualizations of large data sets we wanted to push the boundaries of making our analytics as visually accessible as possible.

Cloudflare had also undergone a rebrand around 2016. While our marketing site had rolled out an updated set of visuals, our product UI as well as a number of existing web properties were still using various versions of our old palette.

Our product palette wasn’t well balanced by itself. Many colors had been chosen one or two at a time. You can see how we chose blueberry, ice, and water at a different point in time than marine and thunder.

The color section of our theme file was partially ordered chronologically

Lacking visual cohesion within our own product, we definitely weren’t providing a cohesive visual experience between our marketing site and our product. The transition from the nice blues and purples to our green CTAs wasn’t the streamlined experience we wanted to afford our users.

Our app dashboard in 2017

Reworking our Palette

Our first step was to audit what we already had. Cloudflare has been around long enough to have more than one website. Beyond cloudflare.com we have dozens of web properties that are publicly accessible. From our community forums, support docs, blog, status page, to numerous micro-sites.

All-in-all we have dozens of front-end codebases that each represent one more chance to introduce entropy to our visual language. So we were curious to answer the question - what colors were we currently using? Were there consistent patterns we could document for further reuse? Could we build a living style guide that didn’t cover just one site, but all of them?

Screenshots of pages from cloudflare.com contrasted with screenshots from our product in 2017

Our curiosity got the best of us and we went about exploring ways we could visualize our design language across all of our sites.

Above - our product palette. Below - our marketing palette.

A time machine for color

As we first started to identify the scale of our color problems, we tried to think outside the box on how we might explore the problem space. After an initial brainstorming session we combined the Internet Archive's Wayback Machine with the CSS Stats API to build an audit tool that shows how the visual properties of our various websites change over time. We can dynamically select which sites we want to compare and scrub through time to see changes.
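The core of that kind of audit loop can be sketched in a few lines of Python. The Wayback Machine's public "available" endpoint is real, but the CSS Stats URL, its JSON shape, and the site list and years below are illustrative assumptions rather than the actual tool described here:

import requests

SITES = ["www.cloudflare.com", "blog.cloudflare.com", "www.cloudflarestatus.com"]  # assumed site list
YEARS = range(2013, 2020)

def snapshot_url(site, year):
    # Wayback Machine "available" API: returns the snapshot closest to the timestamp.
    resp = requests.get(
        "https://archive.org/wayback/available",
        params={"url": site, "timestamp": f"{year}0101"},
        timeout=30,
    )
    closest = resp.json().get("archived_snapshots", {}).get("closest")
    return closest["url"] if closest else None

def colors_used(page_url):
    # Assumed CSS Stats endpoint and response shape -- check the real API before relying on this.
    resp = requests.get("https://cssstats.com/api", params={"url": page_url}, timeout=60)
    props = resp.json().get("stats", {}).get("declarations", {}).get("properties", {})
    return set(props.get("color", []) + props.get("background-color", []))

if __name__ == "__main__":
    for year in YEARS:
        snapshots = [snapshot_url(site, year) for site in SITES]
        palettes = [colors_used(url) for url in snapshots if url]
        shared = set.intersection(*palettes) if palettes else set()
        print(year, "colors shared across every site:", sorted(shared))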

Below is a visualization of palettes from 9 different websites changing over a period of 6 years. Above the palettes is a component that spits out common colors, across all of these sites. The only two common colors across all properties (appearing for only a brief flash) were #ffffff (white) and transparent. Over time we haven’t been very consistent with ourselves.


If we drill in to look at our marketing site compared to our dashboard app - it looks like the video below. We see a bit more overlap at first and then a significant divergence at the 16 second mark when our product palette grew significantly. At the 22 second mark you can see the marketing palette completely change as a result of the rebrand while our product palette stays the same. As time goes on you can see us becoming more and more inconsistent across the two code bases.


As a product team we had some catching up to do to improve our usage of color and to align ourselves with the company brand. The good news was, there was nowhere to go but up.

This style of historical audit gives us a visual indication with real data. We can visualize for stakeholders how consistent and similar our usage of color is across products, and whether we are getting better or worse over time. Having this type of feedback loop was invaluable for us, as auditing this manually is incredibly time consuming and so it often doesn’t get done. Hopefully in the future, just as it’s standard to track various performance metrics over time at a company, it will be standard to be able to visualize your current level of design entropy.

Picking colors

After our initial audit revealed there wasn’t a lot of consistency across sites, we went to work to try and construct a color palette that could potentially be used for sites the product team owned. It was time to get our hands dirty and start “picking colors.”

Hindsight, of course, is always 20/20. We didn’t start out on day one trying to generate scales based on our brand palette. No, our first bright idea was to generate the entire palette from a single color.

Our logo is made up of two oranges. Both of these seemed like prime candidates to generate a palette from.

Thinking about color

We played around with a number of algorithms that took a single color and created a palette. From the initial color we generated an array of scales, one for each hue. Initial attempts found us applying the exact same luminosity curves to each hue, but as visual perception of hues is so different, this resulted in wildly different contrasts at each step of the scale.
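
As an illustration of that first, naive approach, here is a sketch that applies one shared lightness curve to every hue. The curve values and hue offsets are made up for the example; the point is that sharing a single curve across hues is exactly what produced the uneven contrast described above.

// Naive single-curve generation: every hue gets the same lightness steps.
// Because perceived brightness differs so much between hues (yellow vs.
// blue, for instance), the resulting scales end up with very different
// contrast at the same step.
const SHARED_LIGHTNESS = [97, 93, 86, 76, 64, 52, 42, 33, 25, 18];

function scaleForHue(hue, saturation = 80) {
  return SHARED_LIGHTNESS.map((l) => `hsl(${hue}, ${saturation}%, ${l}%)`);
}

function paletteFromSingleColor(baseHue) {
  // Spin a set of hues off the base color; the offsets here are arbitrary.
  const hues = [0, 30, 60, 120, 180, 210, 270].map((offset) => (baseHue + offset) % 360);
  return hues.map((hue) => scaleForHue(hue));
}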

Below are a few of our initial attempts at palette generation. Jeeyoung Jung did a brilliant writeup around designing palettes last year.

Thinking about color
Visualizing peaks of intensity across hues

We can see the intensity of the colors change across hues in peaks, with yellow and green being the most dominant. One of the downsides of this is that when you are rapidly iterating through theming options, the inconsistent relationships between steps across hues can make it time consuming or impossible to keep visual harmony in your interface.

The video below is another way to visualize this phenomenon. The dividing line in the color picker indicates which part of the palette will be accessible with black and white. Notice how drastically the line changes around green and yellow. And then look back at the charts above.

Thinking about color
Demo of https://kevingutowski.github.io/color.html

After fiddling with a few different generative algorithms (we made a lot of ugly palettes…) we decided to try a more manual approach. We pursued creating custom curves for each hue in an effort to keep the contrast scales as optically balanced as possible.

Thinking about color
Heavily muted palette
Thinking about color

Generating different color palettes makes you confront a basic question. How do you tell if a palette is good? Are some palettes better than others? In an effort to answer this question we constructed various feedback loops to help us evaluate palettes as quickly as possible. We tried a few methods to stress test a palette. At first we attempted to grab the “nearest color” for a bunch of our common UI colors. This wasn't always helpful as sometimes you actually want the step above or below the closest existing color. But it was helpful to visualize for a few reasons.
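
The "nearest color" check we used for stress testing can be approximated with a simple distance search. This sketch uses squared RGB distance for brevity; a perceptual space such as CIELAB would track what the eye sees more closely.

// Find the palette entry closest to a target color (6-digit hex values).
function hexToRgb(hex) {
  const n = parseInt(hex.replace('#', ''), 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}

function nearestColor(target, palette) {
  const [tr, tg, tb] = hexToRgb(target);
  let best = null;
  let bestDistance = Infinity;
  for (const candidate of palette) {
    const [r, g, b] = hexToRgb(candidate);
    const distance = (r - tr) ** 2 + (g - tg) ** 2 + (b - tb) ** 2;
    if (distance < bestDistance) {
      bestDistance = distance;
      best = candidate;
    }
  }
  return best;
}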

Thinking about color
Generated palette above a set of components previewing the old and new palette for comparison

Sometime during our exploration in this space, we stumbled across this tweet thread about building a palette for pixel art. There are a lot of places web and product designers can draw inspiration from game designers.

Thinking about color
Two color palettes visualized to create 3d objects
Thinking about color
A color palette applied in a few different contexts

Here we see a similar concept where a number of different palettes are applied to the same component. This view shows us two things, the different ways a single palette can be applied to a sphere, and also the different aesthetics across color palettes.

Thinking about color
Different color palettes previewed against a common component

It’s almost surprising that the default way to construct a color palette for apps and sites isn’t to build it while previewing its application against the most common UI patterns. As designers, there are a lot of consistent uses of color we could have baselines for. Many enterprise apps are centered around a white background with blue as the primary color and mixtures of grays to add depth around cards and page sections. Red is often used for destructive actions like deleting some type of record, and gray for secondary actions - or maybe secondary actions get an outline button in the primary color. Either way, the margins between the patterns aren’t that large in the grand scheme of things.

Consider the use case of designing UI while the palette or usage of color hasn't yet been established. Given a single palette, you might want to experiment with applying that palette in a variety of ways that will output a wide variety of aesthetics. Alternatively you may need to test out several different palettes. These are two different modes of exploration that can be extremely time consuming to work through. It can be non-trivial to keep an in-progress design synced with several different options for color application, even with the best use of layer comps or symbols.

How do we visualize the various ways a palette will look when applied to an interface? Here are examples of how palettes are shown on a palette list for pixel artists.

Thinking about color
Thinking about color
https://lospec.com/palette-list/vines-flexible-linear-ramps

One method of visualization is to define a common set of primitive UI elements and show each one of them with a single set of colors applied. In isolation this can be helpful: it makes it easy to vet a single combination of colors and see which UI elements it might be best applied to.

Alternatively we might want to see a composed interface with the closest colors from the palette applied. Consider a set of buttons that includes red, green, blue, and gray button styles. Seeing all of these together can help us visualize the relative nature of these buttons side by side. Given a baseline palette for common UI, we could swap in a new palette and replace each color with the "closest" color. This isn't always a foolproof solution, as there are many edge cases to cover - for example, what happens when replacing a palette of 134 colors with a palette of 24 colors? Even still, this could allow us to quickly take a stab at automating how existing interfaces would change their appearance given a change to the underlying system. Whether locally or against a live site, this mode of working would allow designers to view a color in multiple contexts to truly assess its quality.

Thinking about color

After moving on from the idea of generating a palette from a single color, we attempted to use our logo colors as well as our primary brand colors to drive the construction of modular scales. Our goal was to create a palette that would improve contrast for accessibility, stay true to our visual brand, work predictably for developers, work for data visualizations, and provide the ability to design visually balanced and attractive interfaces. No sweat.

Thinking about color
Brand colors showing Hue and Saturation level

While we knew going in we might not use every step in every hue, we wanted full coverage across the spectrum so that each hue had a consistent optical difference between each step. We also had no idea which steps across which hues we were going to need just yet. As they would just be variables in a theme file it didn’t add any significant code footprint to expose the full generated palette either.
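
For context, exposing the full palette really is just a handful of arrays in a theme file. A sketch of the shape (the hex values and scale contents here are placeholders, not our actual palette):

// Illustrative theme shape: each hue exposes a fixed-length scale and
// components reference steps by index instead of hard-coding hex values.
const theme = {
  colors: {
    orange: ['#fff3e8', '#ffe2c7', '#ffc894', '#ffa75e', '#fb8e33',
             '#f38020', '#d96d14', '#b3560e', '#8c420b', '#663008'],
    gray:   ['#f8f8f9', '#eaeaec', '#d5d5d9', '#b3b3bb', '#91919c',
             '#6f6f7c', '#5a5a66', '#46464f', '#323239', '#1e1e23'],
  },
};

// Usage: pick steps from a scale rather than one-off values.
const badgeBackground = theme.colors.orange[1];
const badgeText = theme.colors.orange[8];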

One of the more difficult parts was deciding on the number of steps for the scales. Fixing that number would allow us to edit the palette in the future toward a variety of aesthetics and swap it out at the theme level without needing to update anything else.

If and when we did need to augment the available colors in the future, we could edit the entire palette instead of making a one-off addition, which we had found to be a difficult way to work over time. In addition to our primary brand colors we also explored adding scales for yellow / gold, violet, and teal, as well as a gray scale.

The first interface we built for this work was to output all of the scales vertically, with their contrast scores with both white and black on the right hand side. To aid scannability we bolded the values that were above the 4.5 threshold. As we edited the curves, we could see how the contrast ratios were affected at each step. Below you can see an early starting point before the scales were balanced. Red has 6 accessible combos with white, while yellow only has 1. We initially explored having the gray scale be larger than the others.
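
The contrast scores in that preview follow the WCAG 2.x definition: compute the relative luminance of each color, then take the ratio (L1 + 0.05) / (L2 + 0.05). A self-contained sketch of the calculation, with the 4.5:1 threshold we bolded:

// WCAG relative luminance of a 6-digit sRGB hex color.
function luminance(hex) {
  const n = parseInt(hex.replace('#', ''), 16);
  const [r, g, b] = [(n >> 16) & 255, (n >> 8) & 255, n & 255].map((channel) => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(a, b) {
  const [hi, lo] = [luminance(a), luminance(b)].sort((x, y) => y - x);
  return (hi + 0.05) / (lo + 0.05);
}

// 4.5:1 is the WCAG AA threshold for normal text.
const passesAA = (foreground, background) =>
  contrastRatio(foreground, background) >= 4.5;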

Thinking about color
Early iteration of palette preview during development

As both screen luminosity and ambient light can affect perception of color we developed on two monitors, one set to maximum and one set to minimum brightness levels. We also replicated the color scales with a grayscale filter immediately below to help illustrate visual contrast between steps AND across hues. Bouncing back and forth between the grayscale and saturated version of the scale serves as a great baseline reference. We found that going beyond 10 steps made it difficult to keep enough contrast between each step to keep them distinguishable from one another.

Taking a page from our game design friends - as we were balancing the scales and exploring how many steps we wanted in the scales, we were also stress testing the generated colors against various analytics components from our component library.

Our slightly random collection of grays had been a particular pain point as they appeared muddy in a number of places within our interface. For our new palette we used the slightest hint of blue to keep our grays consistent and just a bit off from being purely neutral.

Thinking about color
Optically balanced scales

With a palette consisting of 90 colors, the amount of combinations and permutations that can be applied to data visualizations is vast and can result in a wide variety of aesthetic directions. The same palette applied to both line and bar charts with different data sets can look substantially different, enough that they might not be distinguishable as being the exact same palette. Working with some of our engineering counterparts, we built a pipeline that would put up the same components rendered against different data sets, to simulate the various shapes and sizes the graph elements would appear in. This allowed us to rapidly test the appearance of different palettes. This workflow gave us amazing insights into how a palette would look in our interface. No matter how many hours we spent staring at a palette, we couldn't get an accurate sense of how the colors would look when composed within an interface.

Thinking about color
Analytics charts with blues and oranges. Telling the colors of the lines apart is a different visual experience than separating out the dots in sequential order as they appear in the legend.

We experimented with a number of ideas on visualizing different sizes and shapes of colors and how they affected our perception of how much a color was changing element to element. In the first frame it is most difficult to tell the values at 2% and 6% apart given the size and shape of the elements.

Thinking about color
Stress testing the application of a palette to many shapes and sizes

We’ve begun to package up some of this work into a web app others can use to create or import a palette and preview multiple depths of accessible combinations against a set of UI elements.

The goal is to make it easier for anyone to work seamlessly with color and build beautiful interfaces with accessible color contrasts.

Thinking about color
Color by Cloudflare Design

In an effort to make sure everything we are building will be visually accessible, we built a React component that will preview how a design would look if you were colorblind. The component overlays SVG filters to simulate alternate ways someone can perceive color.

Thinking about color
Analytics component previewed against 8 different types of color blindness

While this is previewing an analytics component, really any component or page can be previewed with this method.

import React from "react"

// Each filter name below corresponds to an SVG filter defined in a separate
// filters.svg file, approximating one type of color vision deficiency.
const filters = [
  'achromatopsia',
  'protanomaly',
  'protanopia',
  'deuteranomaly',
  'deuteranopia',
  'tritanomaly',
  'tritanopia',
  'achromatomaly',
]

// Renders its children once per filter so every simulation can be compared
// side by side. Note: the width/px/display-style props assume a styled-system
// Box-like component; with a plain <div> they would need to be mapped to
// inline styles instead.
const ColorBlindFilter = ({ itemPadding, itemWidth, ...props }) => {
  return (
      <div {...props}>
        {filters.map((filter, i) => (
          <div
            style={{filter: 'url(/filters.svg#'+filter+')'}}
            width={itemWidth}
            px={itemPadding}
            key={i+filter}
          >
            {props.children}
          </div>
        ))}
      </div>
  )
}

ColorBlindFilter.defaultProps = {
  display: 'flex',
  justifyContent: 'space-around',
  flexWrap: 'wrap',
  width: 1,
  itemWidth: 1/4
}

export default ColorBlindFilter

We’ve also released a Figma plugin that simulates this visualization for a component.

After quite a few iterations, we had finally come up with a color palette. Each scale was optically aligned with our brand colors. The 5th step in each scale is the closest to the original brand color, but adjusted slightly so it’s accessible with both black and white.

Thinking about color
Our preview panel for palette development, showing a fully desaturated version of the palette for reference

Lyft’s writeup “Re-approaching color” and Jeeyoung Jung’s “Designing Systematic Colors” are some of the best write-ups on working with color at scale that you can find.

Color migrations

Thinking about color
A visual representation of how the legacy palette colors would translate to the new scales.

Getting a team of people to agree on a new color palette is a journey in and of itself. By the time you get everyone to consensus it’s tempting to just collapse into a heap and never think about colors ever again. Unfortunately the work doesn’t stop at this point. Now that we’ve picked our palette, it’s time to get it implemented so this bike shed is painted once and for all.

If you are porting an old legacy part of your app to be updated to the new style guide like we were, even the best color documentation can fall short in helping someone make the necessary changes.

We found it was more common than expected for engineers and designers to want to know the new equivalent of a color they were already familiar with. During the transition between palettes we had an interface where people could input any color and get the closest color within our new palette.

There are times when migrating colors that the closest color isn't actually what you want. Given the scenario where your brand color has changed from blue to purple, you might want to port all blues to the closest purple within the palette, not to the closest blues, which might still exist in your palette. To help visualize migrations, as well as get suggestions on how to consolidate values within the old scale, we of course built a little tool. Here we can define those translations and import a color palette from a URL. As we still have a number of web properties to update to our new palette, this simple tool has continued to prove useful.
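
A sketch of the idea behind that migration helper, with hypothetical names and illustrative values: explicit translations win, and anything unmapped falls back to a nearest-color lookup like the one sketched earlier.

// Hypothetical migration helper. Explicit translations take priority (e.g.
// "every legacy blue becomes the new purple"); anything unmapped falls back
// to the closest color in the new palette (nearestColor from the earlier sketch).
function migrateColor(legacyColor, translations, newPalette) {
  const explicit = translations[legacyColor.toLowerCase()];
  return explicit ?? nearestColor(legacyColor, newPalette);
}

// Illustrative translation table - not our real values.
const translations = {
  '#2f7bbf': '#6b46c1', // legacy blue -> brand purple
  '#418a2f': '#38a169', // legacy green -> a step of the new green scale
};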

Thinking about color

We wanted to be as gentle as possible in transitioning usage over to the new palette. While developers found string names for colors brittle and unpredictable, the old system was still more familiar to some than the new one. We first added our new palette to the existing theme for use going forward, then started to port colors for existing components and pages.

For our colleagues, we wrote out desired translations and offered warnings in the console that a color was deprecated, with a reference to the new theme value to use.
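
The deprecation warnings were roughly this pattern (a sketch with hypothetical color names and theme keys, not our actual ones): legacy names still resolve, but log a pointer to their replacement.

// Sketch of the deprecation-warning pattern. The names below are
// hypothetical; the real mapping lived alongside our theme.
const deprecatedColors = {
  marine: 'blue.5',
  stormy: 'gray.7',
};

function resolveColor(theme, name) {
  if (name in deprecatedColors) {
    console.warn(
      `Color "${name}" is deprecated. Use theme color "${deprecatedColors[name]}" instead.`
    );
  }
  return theme.colors[name];
}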

Thinking about color
Example of console warning when using deprecated color
Thinking about color
Example of how to check for usage of deprecated values

While we had a few bugs along the way, the team was supportive and helped us fix bugs almost as quickly as we could find them.

We’re still in the process of updating our web properties with our new palette, largely prioritizing accessibility first while creating a more consistent visual brand as a nice by-product of the work. A small example of this is our system status page. In the first image, the blue links in the header, the green status bar, and the about copy were all inaccessible against their backgrounds.

A lot of the changes have been subtle. Most notably, the green we use in the dashboard is much more in line with our brand colors than before. In addition, rather than using straight black text on background colors, we use one of the darker steps from the corresponding scale, which gives the design a bit more visual balance.

Thinking about color
Example page within our Dashboard in 2017 vs 2019

While we aren’t perfect yet, we’re making progress towards more visual cohesion across our marketing materials and products.

2017

Thinking about color
Our app dashboard in 2017

2019

Next steps

Trying to keep dozens of sites all using the same palette in a consistent manner across time is a task that you can never complete. It’s an ongoing maintenance problem. Engineers familiar with the color system leave, new engineers join and need to learn how the system works. People still launch sites using a different palette that doesn’t meet accessibility standards. Our work continues to be cut out for us. As they say, a garden doesn’t tend itself.

If we do ever revisit our brand colors, we're excited to have infrastructure in place to update our apps and several of our satellite sites with significantly less effort than our first time around.

Resources

Some of our favorite materials and resources we found while exploring this problem space.

Apps

Writing

Code

Videos

00:00

Tune in and watch online today: How to build a content management platform fit for the future [The Register]

Advice based on feedback from Register readers and insights from Box

Webcast  Financial institutions across the board are wrestling with how to engage more closely with customers and work better across internal teams. Too often, the cause is ill-fitting content and document management systems, designed for another time. Meanwhile, cloud-based platforms can both help and hinder, delivering short-term benefit but adding complexity and fragmentation.…

Wednesday, 04 December

22:55

123-Reg is at it again: Registrar charges chap for domains he didn’t order – and didn't want [The Register]

The great .uk foist is still rumbling along

Two months after promising customers that its past practices of automatically registering, and billing, folks for .uk domains was all a big misunderstanding, pushy registrar 123-Reg is at it again – charging at least one punter for .uk domains they never ordered and don’t want.…

18:36

Kubernetes? 'I don't believe in one tool to rule the world,' says AWS's sassy Jassy [The Register]

Collaborating with other companies is such a drag

re:Invent  AWS CEO Andy Jassy, asked about the future role of Kubernetes (K8s) in cloud infrastructure, told The Register that "I don’t believe in one tool to rule the world."…

17:55

Atlassian scrambles to fix zero-day security hole accidentally disclosed on Twitter [The Register]

Exposed private cert key may also be an issue for IBM Aspera

Updated  Twitter security celeb SwiftOnSecurity on Tuesday inadvertently disclosed a zero-day vulnerability affecting enterprise software biz Atlassian, a flaw that may be echoed in IBM's Aspera software.…

17:28

Lazarus group goes back to the Apple orchard with new macOS trojan [The Register]

In-memory malware a first for suspected Nork hacking crew

The Lazarus group, which has been named as one of North Korea's state-sponsored hacking teams, has been found to be using new tactics to infect macOS machines.…

17:00

Streaming Cassandra into Kafka in (Near) Real-Time: Part 1 [Yelp Engineering and Product Blog]

At Yelp, we use Cassandra to power a variety of use cases. As of the date of publication, there are 25 Cassandra clusters running in production, each with varying sizes of deployment. The data stored in these clusters is often required as-is or in a transformed state by other use cases, such as analytics, indexing, etc. (for which Cassandra is not the most appropriate data store). As seen in previous posts from our Data Pipeline series, Yelp has developed a robust connector ecosystem around its data stores to stream data both into and out of the Data Pipeline. This two-part...

09:59

Saturday Morning Breakfast Cereal - Unique [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
What if we are the Vogons? I mean, how would we know?


Today's News:

01:00

Fedora Desktops – Memory Footprints [Fedora Magazine]

There are over 40 desktops in Fedora. Each desktop has its own strengths and weaknesses. Usually picking a desktop is a very personal preference based on features, looks, and other qualities. Sometimes, what you pick for a desktop is limited by hardware constraints.

This article is to help people compare Fedora desktops based on the desktop baseline memory. To narrow the scope, we are only looking at the desktops that have an official Fedora Live image.

Installation and Setup

Each of the desktops was installed on its own KVM virtual machine. Each virtual machine had 1 CPU, 4 GB of memory, a 15 GB virtio solid-state disk, and everything else that comes standard with RHEL 8.0 KVM.

The images for installation were the standard Fedora 31 Live images. For GNOME, that image was the Fedora Workstation. For the other desktops, the corresponding Spin was used. Sugar On A Stick (SOAS) was not tested because it does not install easily onto a local drive.

The virtual machine booted into the Live CD. “Install to Hard Disk” was selected. During the install, only the defaults were used. A root user and a regular user were created. After installation and reboot, the Live image was verified to not be in the virtual CDROM.

The settings for each desktop were not touched; each ran whatever defaults came from the Live CD installation. Each desktop was logged into as the regular user and a terminal was opened. Using sudo, each machine ran “dnf -y update”. After the update, in that same terminal, each machine ran “/sbin/shutdown -h now” to shut down.

Testing

Each machine was started up and the desktop was logged into as the regular user. Three of the desktop’s terminals were opened. xterm was never used; it was always the terminal for that desktop, such as konsole.

In one terminal, top was started and M pressed, showing the processes sorted by memory. In another terminal, a simple while loop showed “free -m” every 30 seconds. The third terminal was idle.

I then waited 5 minutes. This allowed any startup services to finish. I recorded the final free result, as well as the final top three memory consumers from top.

Results

  • Cinnamon
    • 624 MB Memory used
    • cinnamon 4.8% / Xorg 2.2% / dnfdragora 1.8%
  • GNOME
    • 612 MB Memory used
    • gnome-shell 6.9% / gnome-software 1.8% / ibus-x11 1.5%
  • KDE
    • 733 MB Memory used
    • plasmashell 6.2% / kwin_x11 3.6% / akonadi_mailfil 2.9%
  • LXDE
    • 318 MB Memory used
    • Xorg 1.9% / nm-applet 1.8% / dnfdragora 1.8%
  • LXQt
    • 391 MB Memory used
    • lxqt-panel 2.2% / pcmanfm-qt 2.1% / Xorg 2.1%
  • MATE
    • 465 MB Memory used
    • Xorg 2.5% / dnfdragora 1.8% / caja 1.5%
  • XFCE
    • 448 MB Memory used
    • Xorg 2.3% / xfwm4 2.0% / dnfdragora 1.8%

Conclusion

I will let the numbers speak for themselves.

Remember that these numbers are from a default Live install. If you remove, or add services and features, your memory usage will change. But this is a good baseline to look at if you are determining your desktop based on memory consumption.

Tuesday, 03 December

10:04

Saturday Morning Breakfast Cereal - Affection [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
Wild orchids are nice but not all that impressive, so really it's a subtle insult.


Today's News:

00:00

Using Ansible to organize your SSH keys in AWS [Fedora Magazine]

If you’ve worked with instances in Amazon Web Services (AWS) for a long time, you may have run into this common issue. It’s not a technical problem, but more a consequence of the human nature of getting too comfortable. When you launch a new instance in a region you haven’t used recently, you may end up creating a new SSH key pair. This leads to having too many keys, which can become complicated and disordered.

This article shows you a way to have your public key in all regions. A recent Fedora Magazine article includes one solution. But the solution in this article is automated even further, and in a more concise and scalable way.

Say you have a Fedora 30 or 31 desktop system where your key is stored, and Ansible is installed as well. These two things together provide the solution to this problem and many more.

With Ansible’s ec2_key module, you can create a simple playbook that will maintain your SSH key pair in all regions. If you need to add or remove keys, it’s as simple as adding and removing lines from a file.

Setting up and running the playbook

To use the playbook, first install necessary dependencies for the ec2_key module:

$ sudo dnf install python3-boto python3-boto3

The playbook is simple: you need only to change your key and its name as in the example below. After that, run the playbook and it iterates over all the public AWS regions listed. The example also includes the restricted regions in case you have access. To include them, uncomment each line as needed, save the file, and then run the playbook again.

---
- name: Maintain an ssh key pair in ec2
  hosts: localhost
  connection: local
  gather_facts: no
  vars:
    ansible_python_interpreter: python
  tasks:
    - name: Make available your ssh public key in ec2 for new instances
      ec2_key:
        name: "YOUR KEY NAME GOES HERE"
        key_material: 'YOUR KEY GOES HERE'
        state: present
        region: "{{ item }}"
      with_items:
        - us-east-2   #US East (Ohio)
        - us-east-1   #US East (N. Virginia)
        - us-west-1   #US West (N. California)
        - us-west-2   #US West (Oregon)
        - ap-east-1   #Asia Pacific (Hong Kong)
        - ap-south-1   #Asia Pacific (Mumbai)
        - ap-northeast-2  #Asia Pacific (Seoul)
        - ap-southeast-1  #Asia Pacific (Singapore)
        - ap-southeast-2  #Asia Pacific (Sydney)
        - ap-northeast-1  #Asia Pacific (Tokyo)
        - ca-central-1   #Canada (Central)
        - eu-central-1   #EU (Frankfurt)
        - eu-west-1   #EU (Ireland)
        - eu-west-2   #EU (London)
        - eu-west-3   #EU (Paris)
        - eu-north-1   #EU (Stockholm)
        - me-south-1   #Middle East (Bahrain)
        - sa-east-1   #South America (Sao Paulo)
  #      - us-gov-east-1  #AWS GovCloud (US-East)
  #      - us-gov-west-1  #AWS GovCloud (US-West)
  #      - ap-northeast-3 #Asia Pacific (Osaka-Local)
  #      - cn-north-1   #China (Beijing)
  #      - cn-northwest-1 #China (Ningxia)

This playbook requires AWS access via API, as well. To do this, use environment variables as follows:

$ AWS_ACCESS_KEY="aws-access-key-id" AWS_SECRET_KEY="aws-secret-key-id" ansible-playbook ec2-playbook.yml

Another option is to install the aws cli tools and add the credentials as explained in a previous Fedora Magazine article. It is not recommended to insert these values in the playbook if you store it anywhere online! You can find this playbook code on GitHub.

After the playbook finishes, confirm that your key is available on the AWS console. To do that:

  1. Log into your AWS console
  2. Go to EC2 > Key Pairs
  3. You should see your key listed. The only limitation is that you have to check region-by-region with this method.

Another way is to use a quick command in a shell to do this check for you.

First, create a variable with all the regions from the playbook:

AWS_REGION="us-east-2 us-east-1 us-west-1 us-west-2 ap-east-1 ap-south-1 ap-northeast-2 ap-southeast-1 ap-southeast-2 ap-northeast-1 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 eu-north-1 me-south-1 sa-east-1"

Then run a for loop over the regions and you will get the results from the AWS API:

for each in ${AWS_REGION} ; do aws ec2 describe-key-pairs --key-name <YOUR KEY NAME GOES HERE> --region ${each} ; done

Keep in mind that to do the above you need to have the aws cli installed.

Monday, 02 December

13:29

The Serverlist: Full Stack Serverless, Serverless Architecture Reference Guides, and more [The Cloudflare Blog]

The Serverlist: Full Stack Serverless, Serverless Architecture Reference Guides, and more

Check out our tenth edition of The Serverlist below. Get the latest scoop on the serverless space, get your hands dirty with new developer tutorials, engage in conversations with other serverless developers, and find upcoming meetups and conferences to attend.

Sign up below to have The Serverlist sent directly to your mailbox.

09:34

Saturday Morning Breakfast Cereal - Thermopolymer [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
The amount of 'well... sorta...' contained in 3 panels here is pretty extraordinary. I'm proud of it.


Today's News:

Sunday, 01 December

09:01

Saturday Morning Breakfast Cereal - Evolution [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
Laugh all you want, but that moth is gonna have total choice of mates.


Today's News:

Saturday, 30 November

10:17

Saturday Morning Breakfast Cereal - Layers [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
This is not the least bit definitely about my wife.


Today's News:

Friday, 29 November

09:20

Saturday Morning Breakfast Cereal - Fine [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
In a proper Hemingway story, it's Marlin the delivery girl spent all day catching, but we work with the data we've got.


Today's News:

08:30

Happy Black Friday! We’ve got some new totes up in the shop if... [Sarah's Scribbles]







Happy Black Friday! We’ve got some new totes up in the shop if you’re interested in some holiday shopping! Check em out here

01:00

A quick introduction to Toolbox on Fedora [Fedora Magazine]

Toolbox allows you to sort and manage your development environments in containers without requiring root privileges or manually attaching volumes. It creates a container where you can install your own CLI tools, without installing them on the base system itself. You can also utilize it when you do not have root access or cannot install programs directly. This article gives you an introduction to toolbox and what it does.

Installing Toolbox

Silverblue includes Toolbox by default. For the Workstation and Server editions, you can grab it from the default repositories using dnf install toolbox.

Creating Toolboxes

Open your terminal and run toolbox enter. The utility will automatically request permission to download the latest image, create your first container, and place your shell inside this container.

$ toolbox enter
No toolbox containers found. Create now? [y/N] y
Image required to create toolbox container.
Download registry.fedoraproject.org/f30/fedora-toolbox:30 (500MB)? [y/N]: y

Currently there is no difference between the toolbox and your base system. Your filesystems and packages appear unchanged. Here is an example using a repository that contains documentation source for a resume under a ~/src/resume folder. The resume is built using the pandoc tool.

$ pwd  
/home/rwaltr 
$ cd src/resume/ 
$ head -n 5 Makefile 
all: pdf html rtf text docx

pdf: init
 pandoc -s -o BUILDS/resume.pdf markdown/*

$ make pdf
bash: make: command not found
$ pandoc -v
bash: pandoc: command not found

This toolbox does not have the programs required to build the resume. You can remedy this by installing the tools with dnf. You will not be prompted for the root password, because you are running in a container.

$ sudo dnf groupinstall "Authoring and Publishing" -y && sudo dnf install pandoc make -y
... 
$ make all #Successful builds
mkdir -p BUILDS
pandoc -s -o BUILDS/resume.pdf markdown/*
pandoc -s -o BUILDS/resume.html markdown/*
pandoc -s -o BUILDS/resume.rtf markdown/*
pandoc -s -o BUILDS/resume.txt markdown/*
pandoc -s -o BUILDS/resume.docx markdown/*
$ ls BUILDS/
resume.docx  resume.html  resume.pdf  resume.rtf  resume.txt

Run exit at any time to exit the toolbox.

$ cd BUILDS/
$ pandoc --version || ls
pandoc 2.2.1
Compiled with pandoc-types 1.17.5.4, texmath 0.11.1.2, skylighting 0.7.5
...
for a particular purpose.
resume.docx  resume.html  resume.pdf  resume.rtf  resume.txt
$ exit 
logout
$ pandoc --version || ls
bash: pandoc: command not found...
resume.docx  resume.html  resume.pdf  resume.rtf  resume.txt

You retain the files created by your toolbox in your home directory. None of the programs installed in your toolbox will be available outside of it.

Tips and tricks

This introduction to toolbox only scratches the surface. Here are some additional tips, but you can also check out the official documentation.

  • toolbox --help will show you the man page for Toolbox
  • You can have multiple toolboxes at once. Use toolbox create -c Toolboxname and toolbox enter -c Toolboxname
  • Toolbox uses Podman to do the heavy lifting. Use toolbox list to find the IDs of the containers Toolbox creates. Podman can use these IDs to perform actions such as rm and stop. (You can also read more about Podman in this Magazine article.)

Photo courtesy of Florian Richter from Flickr.

A History of HTML Parsing at Cloudflare: Part 2 [The Cloudflare Blog]

A History of HTML Parsing at Cloudflare: Part 2
A History of HTML Parsing at Cloudflare: Part 2

The second blog post in the series on HTML rewriters picks up the story in 2017 after the launch of the Cloudflare edge compute platform Cloudflare Workers. It became clear that the developers using workers wanted the same HTML rewriting capabilities that we used internally, but accessible via a JavaScript API.

This blog post describes the building of a streaming HTML rewriter/parser with a CSS-selector based API in Rust. It is used as the back-end for the Cloudflare Workers HTMLRewriter. We have open-sourced the library (LOL HTML) as it can also be used as a stand-alone HTML rewriting/parsing library.

The major change compared to LazyHTML, the previous rewriter, is the dual-parser architecture required to overcome the additional performance overhead of wrapping/unwrapping each token when propagating tokens to the workers runtime. The remainder of the post describes a CSS selector matching engine inspired by a Virtual Machine approach to regular expression matching.

v2 : Give it to everyone and make it faster

In 2017, Cloudflare introduced an edge compute platform - Cloudflare Workers. It was no surprise that customers quickly required the same HTML rewriting capabilities that we were using internally. Our team was impressed with the platform and decided to migrate some of our features to Workers. The goal was to improve our developer experience working with modern JavaScript rather than statically linked NGINX modules implemented in C with a Lua API.

It is possible to rewrite HTML in Workers, though for that you need a third-party JavaScript package (such as Cheerio). These packages are not designed for HTML rewriting on the edge due to the latency, speed and memory considerations described in the previous post.

JavaScript is really fast but it still can’t always produce performance comparable to native code for some tasks - parsing being one of those. Customers typically needed to buffer the whole content of the page to do the rewriting resulting in considerable output latency and memory consumption that often exceeded the memory limits enforced by the Workers runtime.

We started to think about how we could reuse the technology in Workers. LazyHTML was a perfect fit in terms of parsing performance, but it had two issues:

  1. API ergonomics: LazyHTML produces a stream of HTML tokens. This is sufficient for our internal needs. However, for an average user, it is not as convenient as the jQuery-like API of Cheerio.
  2. Performance: Even though LazyHTML is tremendously fast, integration with the Workers runtime adds even more limitations. LazyHTML operates as a simple parse-modify-serialize pipeline, which means that it produces tokens for the whole content of the page. All of these tokens then have to be propagated to the Workers runtime and wrapped inside a JavaScript object and then unwrapped and fed back to LazyHTML for serialization. This is an extremely expensive operation which would nullify the performance benefit of LazyHTML.
A History of HTML Parsing at Cloudflare: Part 2
LazyHTML with V8

LOL HTML

We needed something new, designed with Workers requirements in mind, using a language with the native speed and safety guarantees (it’s incredibly easy to shoot yourself in the foot doing parsing). Rust was the obvious choice as it provides the native speed and the best guarantee of memory safety which minimises the attack surface of untrusted input. Wherever possible the Low Output Latency HTML rewriter (LOL HTML) uses all the previous optimizations developed for LazyHTML such as tag name hashing.

Dual-parser architecture

Most developers are familiar with, and prefer to use, CSS selector-based APIs (as in Cheerio, jQuery or the DOM itself) for HTML mutation tasks. We decided to base our API on CSS selectors as well. Although this meant additional implementation complexity, the decision created even more opportunities for parsing optimizations.

As selectors define the scope of the content that should be rewritten, we realised we can skip the content that is not in this scope and not produce tokens for it. This not only significantly speeds up the parsing itself, but also avoids the performance burden of the back and forth interactions with the JavaScript VM. As ever the best optimization is not to do something.

A History of HTML Parsing at Cloudflare: Part 2

Considering the tasks required, LOL HTML’s parser consists of two internal parsers:

  • Lexer - a regular full parser, that produces output for all types of content that it encounters;
  • Tag scanner - looks for start and end tags and skips parsing the rest of the content. The tag scanner parses only the tag name and feeds it to the selector matcher. The matcher will switch the parser to the lexer if there was a match or if additional information about the tag (such as attributes) is required for matching.

The parser switches back to the tag scanner as soon as input leaves the scope of all selector matches. The tag scanner may also sometimes switch the parser to the Lexer - if it requires additional tag information for the parsing feedback simulation.

A History of HTML Parsing at Cloudflare: Part 2
LOL HTML architecture

Having two different parser implementations for the same grammar will increase development costs and is error-prone due to implementation inconsistencies. We minimize these risks by implementing a small Rust macro-based DSL which is similar in spirit to Ragel. The DSL program describes Nondeterministic finite automaton states and actions associated with each state transition and matched input byte.

An example of a DSL state definition:

tag_name_state {
   whitespace => ( finish_tag_name?; --> before_attribute_name_state )
   b'/'       => ( finish_tag_name?; --> self_closing_start_tag_state )
   b'>'       => ( finish_tag_name?; emit_tag?; --> data_state )
   eof        => ( emit_raw_without_token_and_eof?; )
   _          => ( update_tag_name_hash; )
}

The DSL program gets expanded by the Rust compiler into not quite as beautiful, but extremely efficient Rust code.

We no longer need to reimplement the code that drives the parsing process for each of our parsers. All we need to do is to define different action implementations for each. In the case of the tag scanner, the majority of these actions are a no-op, so the Rust compiler does the NFA optimization job for us: it optimizes away state branches with no-op actions and even whole states if all of the branches have no-op actions. Now that’s cool.

Byte slice processing optimisations

Moving to a memory-safe language provided new challenges. Rust has great memory safety mechanisms, however sometimes they have a runtime performance cost.

The task of the parser is to scan through the input and find the boundaries of lexical units of the language - tokens and their internal parts. For example, an HTML start tag token consists of multiple parts: a byte slice of input that represents the tag name and multiple pairs of input slices that represent attributes and values:

struct StartTagToken<'i> {
   name: &'i [u8],
   attributes: Vec<(&'i [u8], &'i [u8])>,
   self_closing: bool
}

As Rust uses bounds checks on memory access, construction of a token might be a relatively expensive operation. We need to be capable of constructing thousands of them in a fraction of a second, so every CPU instruction counts.

Following the principle of doing as little as possible to improve performance we use a “token outline” representation of tokens: instead of having memory slices for token parts we use numeric ranges which are lazily transformed into a byte slice when required.

struct StartTagTokenOutline {
   name: Range<usize>,
   attributes: Vec<(Range<usize>, Range<usize>)>,
   self_closing: bool
}

As you might have noticed, with this approach we are no longer bound to the lifetime of the input chunk which turns out to be very useful. If a start tag is spread across multiple input chunks we can easily update the token that is currently in construction, as new chunks of input arrive by just adjusting integer indices. This allows us to avoid constructing a new token with slices from the new input memory region (it could be the input chunk itself or the internal parser’s buffer).

This time we can’t get away with avoiding the conversion of input character encoding; we expose a user-facing API that operates on JavaScript strings and the input HTML can be of any encoding. Luckily, we can still parse without decoding and only encode and decode within token bounds on request (though we still can’t do that for UTF-16 encoding).

So, when a user requests an element’s tag name in the API, internally it is still represented as a byte slice in the character encoding of the input, but when provided to the user it gets dynamically decoded. The opposite process happens when a user sets a new tag name.

For selector matching we can still operate on the original encoding representation - because we know the input encoding ahead of time we preemptively convert values in a selector to the page’s character encoding, so comparisons can be done without decoding fields of each token.

As you can see, the new parser architecture along with all these optimizations produced great performance results:

A History of HTML Parsing at Cloudflare: Part 2
Average parsing time depending on the input size - lower is better

LOL HTML’s tag scanner is typically twice as fast as LazyHTML and the lexer has comparable performance, outperforming LazyHTML on bigger inputs. Both are a few times faster than the tokenizer from html5ever - another parser implemented in Rust used in the Mozilla’s Servo browser engine.

CSS selector matching VM

With an impressively fast parser on our hands we had only one thing missing - the CSS selector matcher. Initially we thought we could just use Servo’s CSS selector matching engine for this purpose. After a couple of days of experimentation it turned out that it is not quite suitable for our task.

It did not work well with our dual parser architecture. We first need to match just a tag name from the tag scanner, and then, if we fail, query the lexer for the attributes. The selectors library wasn’t designed with this architecture in mind, so we needed ugly hacks to bail out from matching in case of insufficient information. It was inefficient, as we needed to start matching again after the bailout, doing twice the work. There were other problems too, such as the integration of lazy character decoding and of tag name comparison using tag name hashes.

Matching direction

The main problem encountered was the need to backtrack over all the open elements for matching. Browsers match selectors from right to left and traverse all ancestors of an element. This StackOverflow answer has a good explanation of why they do it this way. We would need to store information about all open elements and their attributes - something that we can’t do while operating with tight memory constraints. This matching approach would also be inefficient for our case - unlike browsers, we expect to have just a few selectors and a lot of elements. In this case it is much more efficient to match selectors from left to right.

And this is when we had a revelation. Consider the following CSS selector:

body > div.foo  img[alt] > div.foo ul

It can be split into individual components attributed to a particular element with hierarchical combinators in between:

body > div.foo img[alt] > div.foo  ul
---    ------- --------   -------  --

Each component is easy to match having a start tag token - it’s just a matter of comparison of token fields with values in the component. Let’s dive into abstract thinking and imagine that each such component is a character in the infinite alphabet of all possible components:

Selector component    Character
body                  a
div.foo               b
img[alt]              c
ul                    d

Let’s rewrite our selector with selector components replaced by our imaginary characters:

a > b c > b d

Does this remind you of something?

The `>` combinator means a child element - “immediately followed by”.

The ` ` (space) combinator means a descendant element - there might be zero or more elements in between.

There is a very well known abstraction for expressing these relations - regular expressions. The combinators in the selector can be replaced with regular expression syntax:

ab.*cb.*d

We transformed our CSS selector into a regular expression that can be executed on the sequence of start tag tokens. Note that not all CSS selectors can be converted to such a regular grammar and the input on which we match has some specifics, which we’ll discuss later. However, it was a good starting point: it allowed us to express a significant subset of selectors.

Implementing a Virtual Machine

Next, we started looking at non-backtracking algorithms for regular expressions. The virtual machine approach seemed suitable for our task as it was possible to have a non-backtracking implementation that was flexible enough to work around differences between real regular expression matching on strings and our abstraction.

VM-based regular expression matching is implemented as one of the engines in many regular expression libraries such as regexp2 and Rust’s regex. The basic idea is that instead of building an NFA or DFA for a regular expression it is instead converted into DSL assembly language with instructions later executed by the virtual machine - regular expressions are treated as programs that accept strings for matching.

Since the VM program is just a representation of NFA with ε-transitions it can exist in multiple states simultaneously during the execution, or, in other words, spawns multiple threads. The regular expression matches if one or more states succeed.

For example, consider the following VM instructions:

  • expect c - waits for the next input character and aborts the thread if it doesn’t equal the instruction’s operand;
  • jmp L - jump to label ‘L’;
  • thread L1, L2 - spawns threads for labels L1 and L2, effectively splitting the execution;
  • match - succeed the thread with a match;

Using this instruction set, the regular expression “ab*c” can be translated into:

    expect a
L1: thread L2, L3
L2: expect b
    jmp L1
L3: expect c
    match

Let’s try to translate the regular expression ab.*cb.*d from the selector we saw earlier:

    expect a
    expect b
L1: thread L2, L3
L2: expect [any]
    jmp L1
L3: expect c
    expect b
L4: thread L5, L6
L5: expect [any]
    jmp L4
L6: expect d
    match

That looks complex! However, this assembly language is designed for regular expressions in general, and regular expressions can be much more complex than our case. For us the only kind of repetition that matters is “.*”. So, instead of expressing it with multiple instructions we can use just one called hereditary_jmp:

    expect a
    expect b
    hereditary_jmp L1
L1: expect c
    expect b
    hereditary_jmp L2
L2: expect d
    match

The instruction tells the VM to memoize the instruction’s label operand and unconditionally spawn a thread with a jump to this label on each input character.

There is one significant distinction between the string input of regular expressions and the input provided to our VM. The input can shrink!

A regular string is just a contiguous sequence of characters, whereas we operate on a sequence of open elements. As new tokens arrive this sequence can grow as well as shrink. Assume we represent <div> as ‘a’ character in our imaginary language, so having <div><div><div> input we can represent it as aaa, if the next token in the input is </div> then our “string” shrinks to aa.

You might think at this point that our abstraction doesn’t work and we should try something else. But consider: what we have as an input for our machine is a stack of open elements, and we need a stack-like structure to store the hereditary_jmp instruction labels that the VM has seen so far. So, why not store them on the open element stack? If we store the next instruction pointer on each of the stack items on which the expect instruction was successfully executed, we’ll have a full snapshot of the VM state, so we can easily roll back to it if our stack shrinks.

With this implementation we don’t need to store anything except a tag name on the stack, and, considering that we can use the tag name hashing algorithm, it is just a 64-bit integer per open element. As an additional small optimization, to avoid traversing the whole stack in search of active hereditary jumps on each new input token, we store on each stack item an index of the first ancestor with a hereditary jump.

For example, having the following selector “body > div span” we’ll have the following VM program (let’s get rid of labels and just use instruction indices instead):

0| expect <body>
1| expect <div>
2| hereditary_jmp 3
3| expect <span>
4| match

Having an input “<body><div><div><a>” we’ll have the following stack:

A History of HTML Parsing at Cloudflare: Part 2

Now, if the next token is a start tag <span>, the VM will first try to execute the selector’s program from the beginning and will fail on the first instruction. However, it will also look for any active hereditary jumps on the stack. We have one, which jumps to the instruction at index 3. After jumping to this instruction the VM successfully produces a match. If we get yet another <span> start tag later it will match as well, following the same steps, which is exactly what we expect for the descendant selector.

If we then receive a sequence of “</span></span></div></a></div>” end tags our stack will contain only one item:

A History of HTML Parsing at Cloudflare: Part 2

which instructs the VM to jump to the instruction at index 1, effectively rolling back to matching the div component of the selector.

Remember we mentioned earlier that we may need to bail out from the matching process if we only have a tag name from the tag scanner and need to obtain more information by running the lexer? With a VM approach it is as easy as stopping the execution of the current instruction and resuming it later when we get the required information.

Duplicate selectors

As we need a separate program for each selector we want to match, how can we stop identical simple components from doing the same job twice? The AST for our selector matching program is a radix tree-like structure whose edge labels are simple selector components and whose nodes are hierarchical combinators.
For example, for the following selectors:

body > div > link[rel]
body > span
body > span a

we’ll get the following AST:

A History of HTML Parsing at Cloudflare: Part 2


If selectors have common prefixes we can match them just once for all these selectors. In the compilation process, we flatten this structure into a vector of instructions.

[not] JIT-compilation

For performance reasons compiled instructions are macro-instructions - they incorporate multiple basic VM instruction calls. This way the VM can execute only one macro instruction per input token. Each of the macro instructions is compiled using the so-called “[not] JIT-compilation” (the same approach to compilation is used in our other Rust project - wirefilter).

Internally the macro instruction contains expect and following jmp, hereditary_jmp and match basic instructions. In that sense macro-instructions resemble microcode making it easy to suspend execution of a macro instruction if we need to request attributes information from the lexer.

What’s next

It is obviously not the end of the road, but hopefully we’ve got a bit closer to it. There are still multiple bits of functionality that need to be implemented, and certainly there is space for more optimizations.

If you are interested in the topic don’t hesitate to join us in the development of LazyHTML and LOL HTML on GitHub and, of course, we are always happy to see people passionate about technology here at Cloudflare, so don’t hesitate to contact us if you are too :).

Thursday, 28 November

08:13

Saturday Morning Breakfast Cereal - Closure [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
I will always think of you as the one who got away but who can still receive my daily texts.


Today's News:

01:44

A History of HTML Parsing at Cloudflare: Part 1 [The Cloudflare Blog]

A History of HTML Parsing at Cloudflare: Part 1
A History of HTML Parsing at Cloudflare: Part 1

To coincide with the launch of streaming HTML rewriting functionality for Cloudflare Workers we are open sourcing the Rust HTML rewriter (LOL HTML) used to back the Workers HTMLRewriter API. We also thought it was about time to review the history of HTML rewriting at Cloudflare.

The first blog post will explain the basics of a streaming HTML rewriter and our particular requirements. We start around 8 years ago by describing the group of ‘ad-hoc’ parsers that were created with specific functionality such as to rewrite e-mail addresses or minify HTML. By 2016 the state machine defined in the HTML5 specification could be used to build a single spec-compliant HTML pluggable rewriter, to replace the existing collection of parsers. The source code for this rewriter is now public and available here: https://github.com/cloudflare/lazyhtml.

The second blog post will describe the next iteration of rewriter. With the launch of the edge compute platform Cloudflare Workers we came to realise that developers wanted the same HTML rewriting capabilities with a JavaScript API. The post describes the thoughts behind a low latency streaming HTML rewriter with a CSS-selector based API. We open-sourced the Rust library as it can also be used as a stand-alone HTML rewriting/parsing library.

What is a streaming HTML rewriter ?

A streaming HTML rewriter takes either an HTML string or a byte stream as input and parses it into tokens or another structured intermediate representation (IR), such as an Abstract Syntax Tree (AST). It then performs transformations on the tokens before converting back to HTML. This provides the ability to modify, extract or add to an existing HTML document as the bytes are being processed. Compare this with a standard HTML tree parser, which needs to retrieve the entire file to generate a full DOM tree. The tree-based rewriter will both take longer to deliver the first processed bytes and require significantly more memory.

A History of HTML Parsing at Cloudflare: Part 1
HTML rewriter

For example, consider that you own a large site with a lot of historical content that you now want to serve over HTTPS. You will quickly run into the problem of resources (images, scripts, videos) being served over HTTP. This ‘mixed content’ opens a security hole and browsers will warn about or block these resources. It can be difficult or even impossible to update every link on every page of a website. With a streaming HTML rewriter you can select the URI attribute of any HTML tag and change any HTTP links to HTTPS. We built this very feature, Automatic HTTPS Rewrites, back in 2016 to solve mixed content issues for our customers.

The reader may already be wondering: “Isn’t this a solved problem, aren’t there many widely used open-source browsers out there with HTML parsers that can be used for this purpose?”. The reality is that writing code to run in 190+ PoPs around the world with a strict low latency requirement turns even seemingly trivial problems into complex engineering challenges.

The following blog posts will detail the journey of how starting with a simple idea of finding email addresses within an HTML page led to building an almost spec compliant HTML parser and then on to a CSS selector matching Virtual Machine. We learned a lot on this journey. I hope you find some of this as interesting as we did.

Rewriting at the edge

When rewriting content through Cloudflare we do not want to impact site performance. The balance in designing a streaming HTML rewriter is to minimise the pause in response byte flow by holding onto as little information as possible whilst retaining the ability to rewrite matching tokens.

The differences in requirements compared to an HTML parser used in a browser include:

Output latency

For browsers, the Document Object Model (DOM) is the end product of the parsing process, but in our case we have to parse, rewrite and serialize back to HTML. In the case of Cloudflare’s reverse proxy, any content processing on the edge server adds latency between the server and an eyeball, so we want each of these stages (parsing, rewriting, serializing) to be as fast as possible.

Parser throughput

Let’s assume that browsers rarely need to deal with HTML pages bigger than 1 MB and that an average page load time is somewhere around 3 s at best. HTML parsing is not the main bottleneck of the page loading process, as the browser will be blocked on running scripts and loading other render-critical resources. We can roughly estimate that ~3 Mbps is an acceptable throughput for a browser’s HTML parser. At Cloudflare, we handle hundreds of megabytes of traffic per CPU, so we need a parser that is faster by an order of magnitude.

Memory limitations

As most users will have noticed, browsers have the luxury of being able to consume memory. For example, this simple HTML markup, when opened in a browser, will consume a significant chunk of your system memory before eventually killing the browser tab (and all of this memory will be consumed by the parser):

<script>
   // The parser sees a '<' but never a closing '>', so it buffers the
   // endless run of characters as a potential tag name.
   document.write('<');
   while(true) {
      document.write('aaaaaaaaaaaaaaaaaaaaaaaa');
   }
</script>

Unfortunately, buffering of some fraction of the input is inevitable even for streaming HTML rewriting. Consider these 2 HTML snippets:

<div foo="bar" qux="qux">
<div foo="bar" qux="qux"


These seemingly similar fragments of HTML will be treated completely differently when encountered at the end of an HTML page. The first fragment will be parsed as a start tag and the second one will be ignored. By just seeing a `<` character followed by a tag name, the parser can’t determine if it has found a start tag or not. It needs to traverse the input in search of the closing `>` to make a decision, buffering all content in between so that it can later be emitted to the consumer as a start tag token.

This requirement forces browsers to buffer content indefinitely before eventually giving up with an out-of-memory error.

In our case, we can’t afford to spend hundreds of megabytes of memory parsing a single HTML file (the actual constraints are even tighter: even using a dozen kilobytes for each request would be unacceptable). We need to be much more sophisticated than other implementations in terms of memory usage and gracefully handle all situations where the provided memory capacity is insufficient to complete parsing.

v0: “Ad-hoc parsers”

As usual with big projects, it all started pretty innocently.

Find and obfuscate an email

In 2010, Cloudflare decided to provide a feature that would stop popular email scrapers. The basic idea of this protection was to find and obfuscate emails on pages and later decode them back in the browser with injected JavaScript code. Sounds easy, right? You search for anything that looks like an email, encode it and then decode it with some JavaScript magic and present the result to the end-user.

However, even such a seemingly simple task already requires solving several issues. First of all, we need to define what an email is, and there is no simple answer. Even the infamous regex supposedly covering the entire RFC is, in fact, outdated and incomplete as the new RFC added lots of valid email constructions, including Unicode support. Let’s not go down that rabbit hole for now and instead focus on a higher-level issue: transforming streaming content.

Content from the network comes in packets, which have to be buffered and parsed as HTTP by our servers. You can’t predict how the content will be split, which means you always need to buffer some of it because content that is going to be replaced can be present in multiple input chunks.

Let’s say we decided to go with a simple regex like `[\w.]+@[\w.]+`. If the content that comes through contains the email “test@example.org”, it might be split in the following chunks:

[Figure: the email “test@example.org” split across multiple input chunks]

In order to keep good Time To First Byte (TTFB) and consistent speed, we want to ensure that the preceding chunk is emitted as soon as we determine that it’s not interesting for replacement purposes.

The easiest way to do that is to transform our regex into a state machine, or finite automaton. While you could do that by hand, you will end up with hard-to-maintain and error-prone code. Instead, Ragel was chosen to transform regular expressions into efficient native state machine code. Ragel doesn’t try to take care of buffering or anything other than traversing the state machine. It provides a syntax that not only describes patterns, but can also associate custom actions (code in a host language) with any given state.

In our case we can pass through buffers until we match the beginning of an email. If we subsequently find out the pattern is not an email we can bail out from buffering as soon as the pattern stops matching. Otherwise, we can retrieve the matched email and replace it with new content.

To turn our pattern into a streaming parser we can remember the position of the potential start of an email and, unless it was already discarded or replaced by the end of the current input, store the unhandled part in a permanent buffer. Then, when a new chunk arrives, we can process it separately, resuming from the state Ragel remembers, and use both the buffered content and the new chunk to either emit the text unchanged or obfuscate the matched email.
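
A rough, hand-written sketch of that buffering behaviour (this is not the Ragel-generated production code; StreamingEmailRewriter is a made-up class, and emit and obfuscate are hypothetical callbacks):

class StreamingEmailRewriter {
  constructor(emit, obfuscate) {
    this.emit = emit           // receives output text as soon as it is safe to send
    this.obfuscate = obfuscate // hypothetical replacement function for a matched email
    this.pending = ''          // tail of the previous chunk that may still grow into an email
  }

  write(chunk) {
    const text = this.pending + chunk
    // Hold back any trailing run that could still become (part of) an email once
    // more input arrives; everything before it can be flushed immediately.
    const tail = text.match(/[\w.]*@?[\w.]*$/)[0]
    const safe = text.slice(0, text.length - tail.length)
    this.emit(safe.replace(/[\w.]+@[\w.]+/g, m => this.obfuscate(m)))
    this.pending = tail
  }

  end() {
    // No more input is coming: whatever is buffered can no longer grow, so flush it.
    this.emit(this.pending.replace(/[\w.]+@[\w.]+/g, m => this.obfuscate(m)))
    this.pending = ''
  }
}

// The email above, split across two chunks, is still rewritten exactly once.
let out = ''
const emailRewriter = new StreamingEmailRewriter(s => { out += s }, () => '[obfuscated]')
emailRewriter.write('Mail us at test@exam')
emailRewriter.write('ple.org for details.')
emailRewriter.end()
// out === 'Mail us at [obfuscated] for details.'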

Now that we have solved the problem of matching email patterns in text, we need to deal with the fact that they need to be obfuscated on pages. This is when the first hints of HTML “parsing” were introduced.

I’ve put “parsing” in quotes because, rather than implementing the whole parser, the email filter (as the module was called) didn’t attempt to replicate the whole HTML grammar, but rather added custom Ragel patterns just for skipping over comments and tags where emails should not be obfuscated.

This was a reasonable approach, especially back in 2010 - four years before the HTML5 specification, when all browsers had their own quirky handling of HTML. However, as you can imagine, this approach did not scale well. If you’re trying to work around quirks in other parsers, you start gaining more and more quirks in your own, and then have to work around these too. Simultaneously, new features started to be added which also required modifying HTML on the fly (like automatic insertion of the Google Analytics script), and the existing module seemed to be the best place for that. It grew to handle more and more tags, operations and syntactic edge cases.

Now let’s minify…

In 2011, Cloudflare decided to also add minification to allow customers to speed up their websites even if they had not employed minification themselves. For that, we decided to use an existing streaming minifier - jitify. It already had NGINX bindings, which made it a great candidate for integration into the existing pipeline.

Unfortunately, just like most other parsers from that time as well as ours described above, it had its own processing rules for HTML, JavaScript and CSS, which weren’t precise but rather tried to parse content on a best-effort basis. This led to us having two independent streaming parsers that were incompatible and could produce bugs either individually or only in combination.

v1: “(Almost) HTML5 spec-compliant parser”

Over the years engineers kept adding new features to the ever-growing state machines, while fixing new bugs arising from imprecise syntax implementations, conflicts between various parsers, and problems in features themselves.

By 2016, it was time to get out of the multiple ad hoc parsers business and do things ‘the right way’.

The next sections will describe how we built our HTML5-compliant parser, starting from the state machine in the specification. Using only this state machine, building a parser should have been straightforward. However, HTML parsing has historically been far from strict, which means that, to avoid breaking existing content, the specification requires an actual DOM to be built during parsing. That is not possible for a streaming rewriter, so a simulator of the parser feedback was developed. In terms of performance, it is always better not to do something at all: we then describe why the rewriter can be ‘lazy’ and avoid the expensive decoding and re-encoding of text when rewriting HTML. Finally, we detail the surprisingly difficult problem of deciding whether a response is HTML at all.

HTML5

By 2016, HTML5 had defined precise syntax rules for parsing, including compatibility with legacy content and custom browser implementations. It was already implemented by all major browsers and by many third-party implementations.

The HTML5 parsing specification defines basic HTML syntax in the form of a state machine. We already had experience with Ragel for similar use cases, so there was no question about what to use for the new streaming parser. Despite the complexity of the grammar, the translation of the specification to Ragel syntax was straightforward. The code looks simpler than the formal description of the state machine, thanks to the ability to mix regex syntax with explicit transitions.

[Figure: A visualisation of a small fraction of the HTML state machine. Source: https://twitter.com/RReverser/status/715937136520916992]

HTML5 parsing requires a ‘DOM’

However, HTML has a history. To avoid breaking existing implementations, HTML5 is specified with recovery procedures for incorrect tag nesting, ordering, unclosed tags, missing attributes and all the other quirks that used to work in older browsers. In order to resolve these issues, the specification expects a tree builder to drive the lexer, essentially meaning you can’t correctly tokenize HTML (split it into separate tags) without a DOM.

[Figure: HTML parsing flow as defined by the specification]

For this reason, most parsers don’t even try to perform streaming parsing and instead take the input as a whole and produce a document tree as an output. This is not something we could do for streaming transformation without adding significant delays to page loading.

An existing HTML5 JavaScript parser - parse5 - had already implemented spec-compliant tree parsing using a streaming tokenizer and rewriter. To avoid having to create a full DOM the concept of a “parser feedback simulator” was introduced.

Tree builder feedback

As you can guess from the name, this is a module that aims to simulate a full parser’s feedback to the tokenizer, without actually building the whole DOM, but instead preserving only the required information and context necessary for correctly driving the state machine.

After rigorous testing and upstreaming a test runner to parse5, we found this technique to be suitable for the majority of even poorly written pages on the Internet, and employed it in LazyHTML.

[Figure: LazyHTML architecture]

Avoiding decoding - everything is ASCII

Now that we had a streaming tokenizer working, we wanted to make sure that it was fast enough so that users didn’t notice any slowdowns to their pages as they go through the parser and transformations. Otherwise it would completely circumvent any optimisations we’d want to attempt on the fly.

Decoding the document’s character encoding would not only cause a performance hit, due to decoding and re-encoding any modified HTML content, but would also significantly complicate our implementation, because multiple sources of encoding information are needed to determine the character encoding, including sniffing of the first 1 KB of the content.

The “living” HTML Standard specification permits only encodings defined in the Encoding Standard. If we look carefully through those encodings, as well as the remark in the Character encodings section of the HTML spec, we find that all of them are ASCII-compatible, with the exception of UTF-16 and ISO-2022-JP.

This means that any ASCII text will be represented in such encodings exactly as it would be in ASCII, and any non-ASCII text will be represented by bytes outside of the ASCII range. This property allows us to safely tokenize, compare and even modify original HTML without decoding or even knowing which particular encoding it contains. It is possible as all the token boundaries in HTML grammar are represented by an ASCII character.
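
As a toy illustration of that property (this is not LazyHTML code; bytesStartWithTag is a made-up helper): a case-insensitive tag-name check can run directly on the raw bytes of a document in any ASCII-compatible encoding, without decoding it first.

// Case-insensitively test whether the raw bytes at `offset` start the tag `name`
// (a lowercase ASCII string such as 'script'), without decoding the document.
function bytesStartWithTag(bytes, offset, name) {
  if (bytes[offset] !== 0x3c) return false        // 0x3c is '<' in every ASCII-compatible encoding
  for (let i = 0; i < name.length; i++) {
    const b = (bytes[offset + 1 + i] || 0) | 0x20 // cheap ASCII lowercasing
    if (b !== name.charCodeAt(i)) return false    // non-ASCII bytes (>= 0x80) can never match
  }
  return true
}

const bytes = new TextEncoder().encode('<SCRIPT src="app.js">')
console.log(bytesStartWithTag(bytes, 0, 'script')) // true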

We need to detect UTF-16 by sniffing and either decode or skip such documents without modification. We chose the latter, to avoid potential security-sensitive bugs, which are common with UTF-16, and because, luckily, this encoding accounts for less than 0.1% of the character encodings we see.

The only issue left with this approach is that in most places the HTML tokenization specification requires you to replace U+0000 (NUL) characters with U+FFFD (the replacement character) during parsing. Presumably, this was added as a security precaution against bugs in C implementations of old engines, which could treat a NUL character encoded in ASCII / UTF-8 / ... as a 0x00 byte and thus as the end of the string (yay, null-terminated strings…). It’s problematic for us because U+FFFD is outside of the ASCII range and will be represented by different sequences of bytes in different encodings. We don’t know the encoding of the document, so this would lead to corruption of the output.

Luckily, we’re not in the same business as browser vendors, and don’t worry about NUL characters in strings as much - we use “fat pointer” string representation, in which the length of the string is determined not by the position of the NUL character, but stored along with the data pointer as an integer field:

typedef struct {
   const char *data; // pointer to the string bytes; not necessarily NUL-terminated
   size_t length;    // explicit length stored alongside the data
} lhtml_string_t;

Instead, we can quietly ignore these parts of the spec (sorry!), keep U+0000 characters as-is, add them as such to tag names, attribute names and other strings, and later re-emit them to the document. This is safe to do because it doesn’t affect any state machine transitions, but merely preserves the original 0x00 bytes and delegates their replacement to the parser in the end user’s browser.

Content type madness

We want to be lazy and minimise false positives. We only want to spend time parsing, decoding and rewriting actual HTML rather than breaking images or JSON. So the question is: how do you decide whether something is an HTML document? Can you just use the Content-Type header, for example? A comment left in the source code best describes the reality.

/*
Dear future generations. I didn't like this hack either and hoped
we could do the right thing instead. Unfortunately, the Internet
was a bad and scary place at the moment of writing. If this
ever changes and websites become more standards compliant,
please do remove it just like I tried.
Many websites use PHP which sets Content-Type: text/html by
default. There is no error or warning if you don't provide own
one, so most websites don't bother to change it and serve
JSON API responses, private keys and binary data like images
with this default Content-Type, which we would happily try to
parse and transforms. This not only hurts performance, but also
easily breaks response data itself whenever some sequence inside
it happens to look like a valid HTML tag that we are interested
in. It gets even worse when JSON contains valid HTML inside of it
and we treat it as such, and append random scripts to the end
breaking APIs critical for popular web apps.
This hack attempts to mitigate the risk by ensuring that the
first significant character (ignoring whitespaces and BOM)
is actually `<` - which increases the chances that it's indeed HTML.
That way we can potentially skip some responses that otherwise
could be rendered by a browser as part of AJAX response, but this
is still better than the opposite situation.
*/

The reader might think that this is a rare edge case; however, our observations show that almost 25% of the traffic served through Cloudflare with the “text/html” content type is unlikely to be HTML.


The trouble doesn’t end there: it turns out that there is a considerable amount of XML content served with the “text/html” content type which can’t be always processed correctly when treated as HTML.

Over time, bailouts for binary data, JSON, AMP and correctly identifying HTML fragments led to the content sniffing logic described by the following diagram:

[Figure: the content sniffing decision diagram]

This is a good example of divergence between formal specifications and reality.

Tag name comparison optimisation

But just having fast parsing is not enough - we have functionality that consumes the output of the parser, rewrites it and feeds it back for serialization. All the memory and time constraints that apply to the parser apply to this code as well, as it is part of the same content processing pipeline.

It’s a common requirement to compare parsed HTML tag names, e.g. to determine if the current tag should be rewritten or not. A naive implementation will use regular per-byte comparison, which can require traversing the whole tag name. We were able to narrow this operation down to a single integer comparison instruction in the majority of cases by using a specially designed hashing algorithm.

The tag names of all standard HTML elements contain only alphabetical ASCII characters and digits from 1 to 6 (in numbered header tags, i.e. <h1> - <h6>). Comparison of tag names is case-insensitive, so we only need 26 characters to represent alphabetical characters. Using the same basic idea as arithmetic coding, we can represent each of the possible 32 characters of a tag name using just 5 bits and, thus, fit up to floor(64 / 5) = 12 characters in a 64-bit integer which is enough for all the standard tag names and any other tag names that satisfy the same requirements! The great part is that we don’t even need to additionally traverse a tag name to hash it - we can do that as we parse the tag name consuming the input byte by byte.

However, there is one problem with this hashing algorithm, and the culprit is not so obvious: to fit all 32 characters in 5 bits we need to use all possible bit combinations, including 00000. This means that if the leading character of the tag name is represented with 00000, we would not be able to differentiate between varying numbers of consecutive repetitions of this character.

For example, considering that ‘a’ is encoded as 00000 and ‘b’ as 00001:

   Tag name   Bit representation        Encoded value
   ab         00000 00001               1
   aab        00000 00000 00001         1

Luckily, we know that HTML grammar doesn’t allow the first character of a tag name to be anything except an ASCII alphabetical character, so reserving numbers from 0 to 5 (00000b-00101b) for digits and numbers from 6 to 31 (00110b - 11111b) for ASCII alphabetical characters solves the problem.
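
A minimal sketch of the resulting hash, written in JavaScript for illustration (the real implementation is C and hashes the bytes as they are consumed by the parser; tagNameHash is a made-up name):

// '1'-'6' map to 0-5 and 'a'-'z' map to 6-31, so each character fits in 5 bits
// and a leading letter never encodes to 00000. Names of up to 12 characters fit
// in 64 bits, so comparing two hashable tag names is a single integer comparison.
function tagNameHash(name) {
  let hash = 0n
  for (const ch of name.toLowerCase()) {
    let code
    if (ch >= '1' && ch <= '6') {
      code = BigInt(ch.charCodeAt(0) - 0x31)     // digits -> 0..5
    } else if (ch >= 'a' && ch <= 'z') {
      code = BigInt(ch.charCodeAt(0) - 0x61 + 6) // letters -> 6..31
    } else {
      return null                                // not hashable: fall back to per-byte comparison
    }
    hash = (hash << 5n) | code
  }
  return hash
}

console.log(tagNameHash('DIV') === tagNameHash('div')) // true: comparison is case-insensitive
console.log(tagNameHash('h1') !== tagNameHash('h2'))   // true: the digit is part of the hash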

LazyHTML

After taking everything mentioned above into consideration, the LazyHTML (https://github.com/cloudflare/lazyhtml) library was created. It is a fast streaming HTML parser and serializer with a token-based C API, derived from the HTML5 lexer written in Ragel. It provides a pluggable transformation pipeline to allow multiple transformation handlers to be chained together.

An example of a handler that transforms the `href` attribute of links:

// define static string to be used for replacements
static const lhtml_string_t REPLACEMENT = {
   .data = "[REPLACED]",
   .length = sizeof("[REPLACED]") - 1
};

static void token_handler(lhtml_token_t *token, void *extra /* this can be your state */) {
  if (token->type == LHTML_TOKEN_START_TAG) { // we're interested only in start tags
    const lhtml_token_starttag_t *tag = &token->start_tag;
    if (tag->type == LHTML_TAG_A) { // check whether tag is of type <a>
      const size_t n_attrs = tag->attributes.count;
      lhtml_attribute_t *attrs = tag->attributes.items; // not const: attribute values are modified below
      for (size_t i = 0; i < n_attrs; i++) { // iterate over attributes
        lhtml_attribute_t *attr = &attrs[i];
        if (lhtml_name_equals(attr->name, "href")) { // match the attribute name
          attr->value = REPLACEMENT; // set the attribute value
        }
      }
    }
  }
  lhtml_emit(token, extra); // pass transformed token(s) to next handler(s)
}

So, is it correct and how fast is it?

It is HTML5-compliant, as tested against the official test suites. As part of the work, several contributions were sent to the specification itself to clarify and simplify the spec language.

Unlike the previous parser(s), it didn't bail out on any of the 2,382,625 documents from the HTTP Archive. However, 0.2% of documents exceeded the expected buffering limits because they were in fact JavaScript, RSS or other types of content incorrectly served with Content-Type: text/html; since almost anything is valid HTML5, the parser tried to parse e.g. a<b; x=3; y=4 as an incomplete tag with attributes. This is very rare (and drops to 0.03% when two error-prone advertisement networks are excluded from the results), but it still needs to be accounted for and is a valid case for bailing out.

As for benchmarks: in September 2016 we used an example that transforms the HTML spec itself (a 7.9 MB HTML file) by replacing every <a href> (only that attribute, and only in those tags) with a static value. It was compared against a few existing and popular HTML parsers (only tokenization mode was used for a fair comparison, so that they don't need to build an AST and so on), and the timings in milliseconds for 100 iterations are as follows (lazy mode means that we use raw strings whenever possible; the other mode serializes each token, just for comparison):

[Figure: benchmark timings in milliseconds for 100 iterations]

The results show that the LazyHTML parser is around an order of magnitude faster.

That concludes the first post in our series on HTML rewriters at Cloudflare. The next post describes how we built a new streaming rewriter on top of the ideas of LazyHTML. The major update was to provide an easier-to-use CSS selector API. It provides the back end for the Cloudflare Workers HTMLRewriter JavaScript API.

01:00

Introducing the HTMLRewriter API to Cloudflare Workers [The Cloudflare Blog]


We are excited to announce that the HTMLRewriter API for Cloudflare Workers is now GA! You can get started today by checking out our documentation, or trying out our tutorial for localizing your site with the HTMLRewriter.

Want to know how it works under the hood? We are excited to tell you everything you wanted to know (but were afraid to ask) about building a streaming HTML parser on the edge; read about it in part 1 (and stay tuned for part two coming tomorrow!).

Faster, more scalable applications at the edge

The HTMLRewriter can help solve two big problems web developers face today: making changes to the HTML, when they are hard to make at the server level, and making it possible for HTML to live on the edge, closer to the user — without sacrificing dynamic functionality.

Since their introduction, Workers have helped customers regain control where control either wasn’t provided or was very hard to obtain at the origin level. Just like Workers can help you set CORS headers at the middleware layer, between your users and the origin, the HTMLRewriter can assist with things like URL rewrites (see the example below!).

Back in January, we introduced the Cache API, giving you more control than ever over what you could store in our edge caches, and how. By giving users closer control of the cache, we made it possible to push experiences that previously could only live at the origin server out to the network edge. A great example of this is the ability to cache POST requests, where previously only GET requests were able to benefit from living on the edge.

Later this year, during Birthday Week, we announced Workers Sites, taking the idea of content on the edge one step further, and allowing you to fully deploy your static sites to the edge.

We were not the first to realize that the web could benefit from more content being accessible from the CDN level. The motivation behind the resurgence of static sites is to have a canonical version of HTML that can easily be cached by a CDN.

This has been great progress for static content on the web, but what about so much of the content that’s almost static, but not quite? Imagine an e-commerce website: the items being offered and the promotions are all the same — except that pesky shopping cart on the top right. That small difference means the HTML can no longer be cached and sent to the edge. That tiny little number of items in the cart requires a trip all the way to your origin, and a call to the database, for every user that visits your site. At this time of year, around Black Friday and Cyber Monday, this means you have to make sure that your origin and database can scale to the maximum load of all the excited shoppers eager to buy presents for their loved ones.

Edge-side JAMstack with HTMLRewriter

One way to solve this problem of reducing origin load while retaining dynamic functionality is to move more of the logic to the client — this approach of making the HTML content static, leveraging a CDN for caching, and relying on client-side API calls for dynamic content is known as the JAMstack (JavaScript, APIs, Markup). The JAMstack is certainly a move in the right direction, trying to maximize content on the edge. We’re excited to embrace that approach even further, by allowing the dynamic logic of the site to live on the edge as well.

In September, we introduced the HTMLRewriter API in beta to the Cloudflare Workers runtime — a streaming parser API to allow developers to develop fully dynamic applications on the edge. HTMLRewriter is a jQuery-like experience inside your Workers application, allowing you to manipulate the DOM using selectors and handlers. The same way jQuery revolutionized front-end development, we’re excited for the HTMLRewriter to change how we think about architecting dynamic applications.

There are a few advantages to rewriting HTML on the edge, rather than on the client. First, updating content client-side introduces a tradeoff between the performance and consistency of the application: if you want to update a page to a logged-in version client-side, the user will either be presented with the static version first and then witness the page update (an inconsistency known as a flicker), or rendering will be blocked until the customization can be fully rendered. Having the DOM modifications made edge-side provides the benefits of a scalable application without degrading the user experience. The second problem is the inevitable client-side bloat. Most of the world today is connected to the internet on a mobile phone with a significantly less powerful CPU, over shaky last-mile connections where each additional API call risks never making it back to the client. With the HTMLRewriter within Cloudflare Workers, the API calls for dynamic content (or even calls directly to Workers KV for user-specific data) can be made from the Worker instead, on a much more powerful machine, over much more resilient backbone connections.

Here’s what the developers at Happy Cog have to say about HTMLRewriter:

"At Happy Cog, we're excited about the JAMstack and static site generators for our clients because of their speed and security, but in the past have often relied on additional browser-based JavaScript and internet-exposed backend services to customize user experiences based on a cookied logged-in state. With Cloudflare's HTMLRewriter, that's now changed. HTMLRewriter is a fundamentally new powerful capability for static sites. It lets us parse the request and customize each user's experience without exposing any backend services, while continuing to use a JAMstack architecture. And it's fast. Because HTMLRewriter streams responses directly to users, our code can make changes on-the-fly without incurring a giant TTFB speed penalty. HTMLRewriter is going to allow us to make architectural decisions that benefit our clients and their users -- while avoiding a lot of the tradeoffs we'd usually have to consider."

Matt Weinberg, President of Development and Technology, Happy Cog

HTMLRewriter in Action

Since most web developers are familiar with jQuery, we chose to keep the HTMLRewriter very similar, using selectors and handlers to modify content.

Below, we have a simple example of how to use the HTMLRewriter to rewrite the URLs in your application from myolddomain.com to mynewdomain.com:

class AttributeRewriter {
  constructor(attributeName) {
    this.attributeName = attributeName
  }

  element(element) {
    const attribute = element.getAttribute(this.attributeName)
    if (attribute) {
      element.setAttribute(
        this.attributeName,
        attribute.replace('myolddomain.com', 'mynewdomain.com')
      )
    }
  }
}

const rewriter = new HTMLRewriter()
  .on('a', new AttributeRewriter('href'))
  .on('img', new AttributeRewriter('src'))

async function handleRequest(req) {
  const res = await fetch(req)
  return rewriter.transform(res)
}

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

The HTMLRewriter, which is instantiated once per Worker, is used here to select the a and img elements.

An element handler responds to any matching element when attached using the .on function of the HTMLRewriter instance.

Here, our AttributeRewriter will receive every a and img element parsed by the HTMLRewriter instance. We then use it to rewrite each attribute (href for the a elements, src for the img elements) to update the URLs in the HTML.

Getting started

You can get started with the HTMLRewriter by signing up for Workers today. For a full guide on how to use the HTMLRewriter API, we suggest checking out our documentation.

How does the HTMLRewriter work?

How does the HTMLRewriter work under the hood? We’re excited to tell you about it, and have spared no details. Check out the first part of our two-blog-post series to learn all about the not-so-brief history of rewriting HTML at Cloudflare, and stay tuned for part two tomorrow!

Wednesday, 27 November

10:30

Harnessing the Power of the People: Cloudflare’s First Security Awareness Month Design Challenge Winners [The Cloudflare Blog]


Grabbing the attention of employees at a security and privacy-focused company on security awareness presents a unique challenge; how do you get people who are already thinking about security all day to think about it some more? October marked Cloudflare’s first Security Awareness Month as a public company and to celebrate, the security team challenged our entire company population to create graphics, slogans, and memes to encourage us all to think and act more securely every day.

Employees approached this challenge with gusto; global participation meant plenty of high quality submissions to vote on. In addition to being featured here, the winning designs will be displayed in Cloudflare offices throughout 2020 and the creators will be on the decision panel for next year’s winners. Three rose to the top, highlighting creativity and style that is uniquely Cloudflarian. I sat down with the winners to talk through their thoughts on security and what all companies can do to drive awareness.

Eugene Wang, Design Team, First Place


Sílvia Flores, Executive Assistant, Second Place


Scott Jones, e-Learning Developer, Third Place

Security Haiku

Wipe that whiteboard clean
Visitors may come and see
Secrets not for them

No tailgating please
You may be a nice person
But I don’t know that

1. What inspired your design?

Eugene: The friendly "Welcome" cloud seen in our all company slides was a jumping off point. It seemed like a great character that embodied being a Cloudflarian and had tons of potential to get into adventures. I also wanted security to be a bit fun, where appropriate. Instead of a serious breach (though it could be), here it was more a minor annoyance personified by a wannabe-sneaky alligator. Add a pun, and there you go—poster design!

Sílvia: What inspired my design was the cute Cloudflare mascot the otter since there are so many otters in SF. Also, security can be fun and I added a pun for all the employees to remember the security system in an entertaining and respectful way. This design is very much my style and I believe making things cute and bright can really grab attention from people who are so busy in their work. A bright, orange, leopard print poster cannot be missed!

Scott: I have always loved the haiku form and poems were allowed!

2. What's the number one thing security teams can do to get non-security people excited about security?

Eugene: Make them realize and identify the threats that can happen everyday, and their role in keeping things secure. Cute characters and puns help.

Sílvia: Make it more accessible for people to engage and understand it, possibly making more activities, content, and creating a fun environment for people to be aware but also be mindful.

Scott: Use whatever means available to keep the idea of being security conscious in everyone's active awareness. This can and should be done in a variety of different ways so as to engage everyone in one way or another, visually with posters and signs, mentally by having contests, multi-sensory through B.E.E.R. meeting presentations and yes, even through a careful use of fear by periodically giving examples of what can happen if security is not followed...I believe that people like working here and believe in what we are doing and how we are doing it, so awareness mixed in with a little fear can reach people on a more visceral and personal level.

3. What's your favorite security tip?

Eugene: Look at the destination of the return email.

Sílvia: LastPass. Oh my lord. I cannot remember one single password since we need to make them so difficult! With numbers, caps, symbols, emojis (ahaha). LastPass makes it easier for me to be secure and still be myself and not remembering any password without freaking out.

Scott: “See something, say something” because it both reflects our basic responsibility to each other and exhibits a pride that we have in being part of a company we believe in and want to protect.

For security practitioners and engagement professionals, it’s easy to try to boil the ocean when Security Awareness Month comes around. The list of potential topics and guidance is endless. Focusing on two or three key messages, gauging the maturity of your organization, and encouraging participation from everyone makes it a company-wide effort. Extra recognition and glory for those who go over and above never hurts either.

Want to run a security awareness design contest at your company? Reach out to us at securityawareness@cloudflare.com for tips and best practices for getting started, garnering support, and encouraging participation.


08:38

Saturday Morning Breakfast Cereal - Sunset [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
With the right assumptions, I think you could get to a negative number.


Today's News:

01:00

Create virtual machines with Cockpit in Fedora [Fedora Magazine]

This article shows you how to install the software you need to use Cockpit to create and manage virtual machines on Fedora 31. Cockpit is an interactive admin interface that lets you access and manage systems from any supported web browser. With virt-manager being deprecated, users are encouraged to use Cockpit, which is meant to replace it.

Cockpit is an actively developed project, with many plugins available that extend how it works. For example, one such plugin is “Machines,” which interacts with libvirtd and lets users create and manage virtual machines.

Installing software

The required software prerequisites are libvirt, cockpit and cockpit-machines. To install them on Fedora 31, run the following command from a terminal using sudo:

$ sudo dnf install libvirt cockpit cockpit-machines

Cockpit is also included as part of the “Headless Management” package group. This group is useful for a Fedora based server that you only access through a network. In that case, to install it, use this command:

$ sudo dnf groupinstall "Headless Management"

Setting up Cockpit services

After installing the necessary packages, it’s time to enable the services. The libvirtd service runs the virtual machines, while Cockpit uses a socket-activated service to let you access the web GUI:

$ sudo systemctl enable libvirtd --now
$ sudo systemctl enable cockpit.socket --now

This should be enough to run virtual machines and manage them through Cockpit. Optionally, if you want to access and manage your machine from another device on your network, you need to expose the service to the network. To do this, add a new rule in your firewall configuration:

$ sudo firewall-cmd --zone=public --add-service=cockpit --permanent
$ sudo firewall-cmd --reload

To confirm the services are running and no issues occurred, check the status of the services:

$ sudo systemctl status libvirtd
$ sudo systemctl status cockpit.socket

At this point everything should be working. The Cockpit web GUI should be available at https://localhost:9090 or https://127.0.0.1:9090. Or, enter the local network IP in a web browser on any other device connected to the same network. (Without SSL certificates set up, you may need to allow the connection in your browser.)

Creating and installing a machine

Log into the interface using the user name and password for that system. You can also choose whether to allow your password to be used for administrative tasks in this session.

Select Virtual Machines and then select Create VM to build a new box. The console gives you several options:

  • Download an OS using Cockpit’s built in library
  • Use install media already downloaded on the system you’re managing
  • Point to a URL for an OS installation tree
  • Boot media over the network via the PXE protocol

Enter all the necessary parameters. Then select Create to power up the new virtual machine.

At this point, a graphical console appears. Most modern web browsers let you use your keyboard and mouse to interact with the VM console. Now you can complete your installation and use your new VM, just as you would via virt-manager in the past.


Photo by Miguel Teixeira on Flickr (CC BY-SA 2.0).

Tuesday, 26 November

08:55

Saturday Morning Breakfast Cereal - Dear [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
Just to stop the loophole, Jesus changed his name to Steve-us.


Today's News:

Monday, 25 November

09:08

Saturday Morning Breakfast Cereal - Flash [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
Sometimes at night he wonders where the locus of his conscious being is and whether it's near his butt.


Today's News:

Sunday, 24 November

23:43

Welcoming our new Fedora Community Action and Impact Coordinator [Fedora Magazine]

Good news, everybody! I’m pleased to announce that we have completed our search for a new Fedora Community Action and Impact Coordinator, and she’ll be joining the Open Source Program Office (OSPO) team to work with Fedora as of today. Please give a warm welcome to Marie Nordin.

If you’ve been involved in Fedora, you may have already been working with Marie. She’s a member of the Fedora Design and Badges teams. Her latest contribution to the Design Team is the wallpaper for F31, a collaboration with Máirín Duffy. Marie has made considerable contributions to the Badges project. She has created over 150 badge designs, written documentation and a style guide, and mentored new design contributors for years. Most recently she has been spearheading work to bring Badges up to date on both the development and the UI/UX of the web app.

Marie is new to Red Hat, joining us after 5 years of involvement with the Fedora community. She was first introduced to Fedora through an Outreachy internship in 2013, working on Fedora Badges. Marie’s most recent full-time position was in the distribution industry as a purchasing agent, bid coordinator, and manager. She also has a strong background in design outside of her efforts for Fedora, having worked as a freelance graphic designer for the past 8 years.

I believe that Marie’s varied background in business and administration, her experience with design, and her long term involvement with and passion for Fedora makes her an excellent fit for this position. I’m excited to work with her as both a colleague on her team at Red Hat and as a Fedora contributor.

Feel free to reach out with congratulations, but give her a bit to get fully engaged with Fedora duties.

Congratulations, Marie!

09:19

Saturday Morning Breakfast Cereal - Mimic [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
You may have friends who've been doing this for years.


Today's News:

08:30

I’ll have books and merch for sale and am happy to sign anything... [Sarah's Scribbles]



I’ll have books and merch for sale and am happy to sign anything you bring! See you soon!

Saturday, 23 November

09:07

Saturday Morning Breakfast Cereal - Clouds [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
One of these days I need to do a whole book with just comics about how life will be better once humans are gone.


Today's News:

Friday, 22 November

07:02

Saturday Morning Breakfast Cereal - Should [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
They're planning to become evil after they get emeritus status, but by then they'll be too tired.


Today's News:

01:00

Sharing Fedora [Fedora Magazine]

After being a Fedora user for a while, you may have come to enjoy it. And in fact you might want to encourage others to try Fedora. You don’t need any special privileges or to become a Fedora Ambassador to do that. As it turns out, anyone can help others get started with Fedora just by sharing information about it.

Having the conversation

For example, if you go out to lunch with a group of colleagues periodically, you might find it natural to talk about Fedora with them. If someone shows interest, you can suggest getting together with them for a Fedora show and tell. There isn’t any need for formal presentations or prepared talks. This is just having lunch and sharing information with people you know.

When you’re with friends, relatives, colleagues, or neighbors, conversation often turns to things computer related, and you can bring up Fedora. There are usually opportunities to point out how Fedora would partially if not completely address their concerns or provide something they want.

These are people you know so talking with them is easy and natural. You probably know the kind of things they use PCs for, so you know the features of Fedora that will be attractive to them. Such conversations can start anytime you see someone you know. You don’t need to steer conversations toward Fedora — that might be impolite, depending on the situation. But if they bring up computer related issues, you might find an opportunity to talk about Fedora.

Taking action

If a friend or colleague has an unused laptop, you could offer to show them how easy it is to load Fedora. You can also point out that there’s no charge and that the licenses are friendly to users. Sharing a USB key or a DVD is almost always helpful.

When you have someone set up to use Fedora, make sure they have the URLs for discussions, questions, and other related websites. Also, from time to time, let them know if you’ve seen an application they might find useful. (Hint: You might want to point them at a certain online magazine, too!)

The next time you’re with someone you know and they start talking about a computer related issue, tell them about Fedora and how it works for you. If they seem interested, give them some ideas on how Fedora could be helpful for them.

Open source may be big business nowadays, but it also remains a strong grassroots movement. You too can help grow open source through awareness and sharing!


Photo by Sharon McCutcheon from Unsplash.

Thursday, 21 November

09:19

Saturday Morning Breakfast Cereal - Your Kid [Saturday Morning Breakfast Cereal]



Click here to go see the bonus panel!

Hovertext:
I don't know anything about my kids! I've reduced their entire world of negative emotions to 'probably teething' !


Today's News:

07:00

Introducing Flan Scan: Cloudflare’s Lightweight Network Vulnerability Scanner [The Cloudflare Blog]


Today, we’re excited to open source Flan Scan, Cloudflare’s in-house lightweight network vulnerability scanner. Flan Scan is a thin wrapper around Nmap that converts this popular open source tool into a vulnerability scanner with the added benefit of easy deployment.

We created Flan Scan after two unsuccessful attempts at using “industry standard” scanners for our compliance scans. A little over a year ago, we were paying a big vendor for their scanner until we realized it was one of our highest security costs and many of its features were not relevant to our setup. It became clear we were not getting our money’s worth. Soon after, we switched to an open source scanner and took on the task of managing its complicated setup. That made it difficult to deploy to our entire fleet of more than 190 data centers.

We had a deadline at the end of Q3 to complete an internal scan for our compliance requirements but no tool that met our needs. Given our history with existing scanners, we decided to set off on our own and build a scanner that worked for our setup. To design Flan Scan, we worked closely with our auditors to understand the requirements of such a tool. We needed a scanner that could accurately detect the services on our network and then lookup those services in a database of CVEs to find vulnerabilities relevant to our services. Additionally, unlike other scanners we had tried, our tool had to be easy to deploy across our entire network.

We chose Nmap as our base scanner because, unlike other network scanners which sacrifice accuracy for speed, it prioritizes detecting services thereby reducing false positives. We also liked Nmap because of the Nmap Scripting Engine (NSE), which allows scripts to be run against the scan results. We found that the “vulners” script, available on NSE, mapped the detected services to relevant CVEs from a database, which is exactly what we needed.

The next step was to make the scanner easy to deploy while ensuring it outputted actionable and valuable results. We added three features to Flan Scan which helped package up Nmap into a user-friendly scanner that can be deployed across a large network.

  • Easy Deployment and Configuration - To create a lightweight scanner with easy configuration, we chose to run Flan Scan inside a Docker container. As a result, Flan Scan can be built and pushed to a Docker registry and maintains the flexibility to be configured at runtime. Flan Scan also includes sample Kubernetes configuration and deployment files with a few placeholders so you can get up and scanning quickly.
  • Pushing results to the Cloud - Flan Scan adds support for pushing results to a Google Cloud Storage Bucket or an S3 bucket. All you need to do is set a few environment variables and Flan Scan will do the rest. This makes it possible to run many scans across a large network and collect the results in one central location for processing.
  • Actionable Reports - Flan Scan generates actionable reports from Nmap’s output so you can quickly identify vulnerable services on your network, the applicable CVEs, and the IP addresses and ports where these services were found. The reports are useful for engineers following up on the results of the scan as well as auditors looking for evidence of compliance scans.
[Figure: Sample run of Flan Scan from start to finish]

How has Flan Scan improved Cloudflare's network security?

By the end of Q3, not only had we completed our compliance scans, we also used Flan Scan to tangibly improve the security of our network. At Cloudflare, we pin the software version of some services in production because it allows us to prioritize upgrades by weighing the operational cost of upgrading against the improvements of the latest version. Flan Scan’s results revealed that our FreeIPA nodes, used to manage Linux users and hosts, were running an outdated version of Apache with several medium severity vulnerabilities. As a result, we prioritized their update. Flan Scan also found a vulnerable instance of PostgreSQL leftover from a performance dashboard that no longer exists.

Flan Scan is part of a larger effort to expand our vulnerability management program. We recently deployed osquery to our entire network to perform host-based vulnerability tracking. By complementing osquery’s findings with Flan Scan’s network scans we are working towards comprehensive visibility of the services running at our edge and their vulnerabilities. With two vulnerability trackers in place, we decided to build a tool to manage the increasing number of vulnerability  sources. Our tool sends alerts on new vulnerabilities, filters out false positives, and tracks remediated vulnerabilities. Flan Scan’s valuable security insights were a major impetus for creating this vulnerability tracking tool.

How does Flan Scan work?


The first step of Flan Scan is running an Nmap scan with service detection. Flan Scan's default Nmap scan runs the following scans:

  1. ICMP ping scan - Nmap determines which of the IP addresses given are online.
  2. SYN scan - Nmap scans the 1000 most common ports of the IP addresses which responded to the ICMP ping. Nmap marks ports as open, closed, or filtered.
  3. Service detection scan - To detect which services are running on open ports Nmap performs TCP handshake and banner grabbing scans.

Other types of scanning such as UDP scanning and IPv6 addresses are also possible with Nmap. Flan Scan allows users to run these and any other extended features of Nmap by passing in Nmap flags at runtime.

[Figure: Sample Nmap output]

Flan Scan adds the “vulners” script to its default Nmap command to include in the output a list of vulnerabilities applicable to the detected services. The vulners script works by making API calls to a service run by vulners.com, which returns any known vulnerabilities for the given service.

[Figure: Sample Nmap output with Vulners script]

The next step of Flan Scan uses a Python script to convert the structured XML of Nmap’s output into an actionable report. The reports of the previous scanner we used listed each of the IP addresses scanned and presented the vulnerabilities applicable to that address. Since we had multiple IP addresses running the same service, the report would repeat the same list of vulnerabilities under each of these IP addresses. This meant scrolling back and forth through documents hundreds of pages long to obtain a list of all IP addresses with the same vulnerabilities. The results were impossible to digest.

Flan Scan’s results are structured around services. The report enumerates all vulnerable services, with a list beneath each one of the relevant vulnerabilities and all IP addresses running that service. This structure makes the report shorter and more actionable, since the services that need to be remediated can be clearly identified. Flan Scan reports are made using LaTeX, because who doesn’t like nicely formatted reports that can be generated with a script? The raw LaTeX file that Flan Scan outputs can be converted to a beautiful PDF using tools like pdflatex or TeXShop.

[Figure: Sample Flan Scan report]

What’s next?

Cloudflare’s mission is to help build a better Internet for everyone, not just Internet giants who can afford to buy expensive tools. We’re open sourcing Flan Scan because we believe it shouldn’t cost tons of money to have strong network security.

You can get started running a vulnerability scan on your network in a few minutes by following the instructions on the README. We welcome contributions and suggestions from the community.

Wednesday, 20 November

09:30

Even faster connection establishment with QUIC 0-RTT resumption [The Cloudflare Blog]


One of the more interesting features introduced by TLS 1.3, the latest revision of the TLS protocol, was the so-called “zero round-trip time connection resumption”, a mode of operation that allows a client to start sending application data, such as HTTP requests, without having to wait for the TLS handshake to complete, thus reducing the latency penalty incurred in establishing a new connection.

The basic idea behind 0-RTT connection resumption is that if the client and server had previously established a TLS connection between each other, they can use information cached from that session to establish a new one without having to negotiate the connection’s parameters from scratch. Notably this allows the client to compute the private encryption keys required to protect application data before even talking to the server.

However, in the case of TLS, “zero roundtrip” only refers to the TLS handshake itself: the client and server are still required to first establish a TCP connection in order to be able to exchange TLS data.


Zero means zero

QUIC goes a step further, and allows clients to send application data in the very first roundtrip of the connection, without requiring any other handshake to be completed beforehand.


After all, QUIC already shaved a full round-trip off of a typical connection’s handshake by merging the transport and cryptographic handshakes into one. By reducing the handshake by an additional roundtrip, QUIC achieves real 0-RTT connection establishment.

It literally can’t get any faster!

Attack of the clones

Unfortunately, 0-RTT connection resumption is not all smooth sailing, and it comes with caveats and risks, which is why Cloudflare does not enable 0-RTT connection resumption by default. Users should consider the risks involved and decide whether to use this feature or not.

For starters, 0-RTT connection resumption does not provide forward secrecy, meaning that a compromise of the secret parameters of a connection will trivially allow compromising the application data sent during the 0-RTT phase of new connections resumed from it. Data sent after the 0-RTT phase, meaning after the handshake has been completed, would still be safe though, as TLS 1.3 (and QUIC) will still perform the normal key exchange algorithm (which is forward secret) for data sent after the handshake completion.

More worryingly, application data sent during 0-RTT can be captured by an on-path attacker and then replayed multiple times to the same server. In many cases this is not a problem, as the attacker wouldn’t be able to decrypt the data, which is why 0-RTT connection resumption is useful, but in some cases this can be dangerous.

For example, imagine a bank that allows an authenticated user (e.g. using HTTP cookies, or other HTTP authentication mechanisms) to send money from their account to another user by making an HTTP request to a specific API endpoint. If an attacker was able to capture that request when 0-RTT connection resumption was used, they wouldn’t be able to see the plaintext and get the user’s credentials, because they wouldn’t know the secret key used to encrypt the data; however they could still potentially drain that user’s bank account by replaying the same request over and over:


Of course this problem is not specific to banking APIs: any non-idempotent request has the potential to cause undesired side effects, ranging from slight malfunctions to serious security breaches.

In order to help mitigate this risk, Cloudflare will always reject 0-RTT requests that are obviously not idempotent (like POST or PUT requests), but in the end it’s up to the application sitting behind Cloudflare to decide which requests can and cannot be allowed with 0-RTT connection resumption, as even innocuous-looking ones can have side effects on the origin server.

To help origins detect and potentially disallow specific requests, Cloudflare also follows the techniques described in RFC8470. Notably, Cloudflare will add the Early-Data: 1 HTTP header to requests received during 0-RTT resumption that are forwarded to origins.

Origins able to understand this header can then decide to answer the request with the 425 (Too Early) HTTP status code, which will instruct the client that originated the request to retry sending the same request, but only after the TLS or QUIC handshake has fully completed, at which point there is no longer any risk of replay attacks. This could even be implemented as part of a Cloudflare Worker.
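
To make this concrete, here is a minimal origin-side sketch (not Cloudflare's implementation), assuming a Python/Flask origin; the route and endpoint names are hypothetical:

from flask import Flask, request

app = Flask(__name__)

@app.before_request
def reject_early_data():
    # Cloudflare forwards "Early-Data: 1" on requests received during 0-RTT.
    # Answering 425 (Too Early) tells the client to retry the request once the
    # handshake has fully completed, removing the replay risk for unsafe methods.
    if request.headers.get("Early-Data") == "1" and request.method not in ("GET", "HEAD"):
        return "Too Early", 425

@app.route("/api/transfer", methods=["POST"])  # hypothetical sensitive endpoint
def transfer():
    return "transfer accepted", 200

The same check could equally live in a Cloudflare Worker in front of the origin, as the paragraph above suggests.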


This makes it possible for origins to allow 0-RTT requests for endpoints that are safe, such as a website’s index page, which is where 0-RTT is most useful since it is typically the first request a browser makes after establishing a connection, while still protecting other endpoints such as APIs and form submissions. But if an origin does not provide any of those non-idempotent endpoints, no action is required.

One stop shop for all your 0-RTT needs

Just like we previously did for TLS 1.3, we now support 0-RTT resumption for QUIC as well. In honor of this event, we have dusted off the user-interface controls that allow Cloudflare users to enable this feature for their websites, and introduced a dedicated toggle to control whether 0-RTT connection resumption is enabled or not, which can be found under the “Network” tab on the Cloudflare dashboard:


When TLS 1.3 and/or QUIC (via the HTTP/3 toggle) are enabled, 0-RTT connection resumption will be automatically offered to clients that support it, and the replay mitigation mentioned above will also be applied to the connections making use of this feature.

In addition, if you are a user of our open-source HTTP/3 patch for NGINX, after updating the patch to the latest version, you’ll be able to enable support for 0-RTT connection resumption in your own NGINX-based HTTP/3 deployment by using the built-in “ssl_early_data” option, which will work for both TLS 1.3 and QUIC+HTTP/3.

09:11

Saturday Morning Breakfast Cereal - Superior Intelligence [Saturday Morning Breakfast Cereal]




Hovertext:
The twist is that the robots are supplying the dog videos.



01:00

Set up single sign-on for Fedora Project services [Fedora Magazine]

In addition to an operating system, the Fedora Project provides services for users and developers. Services such as Ask Fedora, the Fedora Project wiki and the Fedora Project mailing lists help users learn how to best take advantage of Fedora. For developers of Fedora, there are many other services such as dist-git, Pagure, Bodhi, COPR and Bugzilla for the packaging and release process.

These services are available with a free account from the Fedora Accounts System (FAS). This account is the passport to all things Fedora! This article covers how to get set up with an account and configure Fedora Workstation for browser single sign-on.

Signing up for a Fedora account

To create a FAS account, browse to the account creation page. Here, you will fill out your basic identity data:

Account creation page

Once you enter your data, the account system sends an email to the address you provided, with a temporary password. Pick a strong password and use it.

Password reset page

Next, the account details page appears. If you want to contribute to the Fedora Project, you should complete the Contributor Agreement now. Otherwise, you are done and you can use your account to log into the various Fedora services.

Account details page

Configuring Fedora Workstation for single sign-on

Now that you have your account, you can sign into any of the Fedora Project services. Most of these services support single sign-on (SSO), so you can sign in without re-entering your username and password.

Fedora Workstation provides an easy workflow to add your Fedora credentials. The GNOME Online Accounts tool helps you quickly set up your system to access many popular services. To access it, go to the Settings menu.

Click on the option labeled Fedora. A prompt opens for you to provide your username and password for your Fedora Account.

GNOME Online Accounts stores your password in GNOME Keyring and automatically acquires your single-sign-on credentials for you when you log in.

Single sign-on with a web browser

Today, Fedora Workstation supports single sign-on to the Fedora Project services out of the box in three web browsers: Mozilla Firefox, GNOME Web, and Google Chrome.

Due to a bug in Chromium, single sign-on doesn’t currently work if you have more than one set of Kerberos (SSO) credentials active in your session. As a result, Fedora doesn’t enable this function out of the box for Chromium.

To sign on to a service, browse to it and select the login option for that service. For most Fedora services, this is all you need to do; the browser handles the rest. Some services such as the Fedora mailing lists and Bugzilla support multiple login types. For them, select the Fedora or Fedora Account System login type.

That’s it! You can now log into any of the Fedora Project services without re-entering your password.

Special consideration for Google Chrome

To enable single sign-on out of the box for Google Chrome, Fedora takes advantage of certain features in Chrome that are intended for use in “managed” environments. A managed environment is traditionally a corporate or other organization that sets certain security and/or monitoring requirements on the browser.

Recently, Google Chrome changed its behavior and it now reports Managed by your organization or possibly Managed by fedoraproject.org under the ⋮ menu in Google Chrome. That link leads to a page that says, “If your Chrome browser is managed, your administrator can set up or restrict certain features, install extensions, monitor activity, and control how you use Chrome.” However, Fedora will never monitor your browser activity or restrict your actions.

Enter chrome://policy in the address bar to see exactly what settings Fedora has enabled in the browser. The AuthNegotiateDelegateWhitelist and AuthServerWhitelist options will be set to *.fedoraproject.org. These are the only changes Fedora makes.

Tuesday, 19 November

17:00

Organizing and Securing Third-Party CDN Assets at Yelp [Yelp Engineering and Product Blog]

At Yelp, we use a service-oriented architecture to serve our web pages. This consists of a lot of frontend services, each of which is responsible for serving different pages (e.g., the search page or a business listing page). In these frontend services, we use a couple of third-party JavaScript/CSS assets (React, Babel polyfill, etc.) to render our web pages. We chose to serve such assets using a third-party Content Delivery Network (CDN) for better performance. In the past, if a frontend service needed to use a third-party JavaScript/CSS asset, engineers had to hard-code its CDN URL. For example: <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.8.3/jquery.min.js"...

08:21

Saturday Morning Breakfast Cereal - Simile [Saturday Morning Breakfast Cereal]




Hovertext:
Full credit to Jess Zimmerman, whom I believe pointed out the nature of this idea first!



02:33

Sailfish X for Sony Xperia 10 now available [Jolla Blog]

Today we are happy to announce the availability of Sailfish X for Sony Xperia 10. We also introduce a campaign giving all existing Sailfish customers a nice offer on the Sailfish X licence for Xperia 10, and for other devices.

As the latest additions to the Sailfish X product family, the Xperia 10 and Xperia 10 Plus have been reviewed as good value-for-money devices with eye-catching 6 and 6.5-inch 21:9 displays, and premium build quality. The devices are also the first Sailfish devices to come with user data encryption enabled by default. We think they’re great devices and we think you’ll love them too.

The Xperia 10 and Xperia 10 Plus can fully utilise all the latest features and updates in the recently announced Sailfish OS 3.2.0 Torronsuo release, including the latest hardware adaptation support updates, the enhanced security features, the latest Android App Support and more.

 

Sailfish X offer for all Sailfish users

To celebrate the release of Sailfish X for the Sony Xperia 10 and Xperia 10 Plus, we have a special offer for all current Sailfish customers: you can now purchase a new Sailfish X licence for any supported Sailfish device for just 29.90€ for a limited time (normal price 49.90€). No matter which Sailfish based device you’ve been using, as long as you have purchased a licence or a Sailfish OS device, and activated your Jolla account, this is for you!

To be clear, this can be any Jolla-branded phone or tablet, Intex Aquafish, Sony Xperia, Gemini PDA, etc. The offer is valid only for a limited time period until December 31, 2019, so if you want to get a new Sailfish device, now is a good time for it!

To utilise the offer, just go to the Jolla Shop and log in with your Jolla account.

 

 

Sailfish X survey gave us valuable feedback

In order to better understand the wishes for the Sailfish X program, we conducted a survey during the summer of 2019. The response was phenomenal and many Sailfish X users gave us valuable feedback on satisfaction levels, availability, pricing, and other topics.

One of the questions was about wishes and willingness to switch to newer Sailfish X devices. Over 63% of respondents answered that they are interested in moving to a newer device with Sailfish X. This is one of the reasons why we’ve focused our efforts on introducing new devices and also why we are today announcing this special offer for current Sailfish customers.

We’ve also been exploring the possibility of switching to a subscription-based model for the Sailfish X program. From the survey we found out that the majority of respondents clearly do not support this idea, and hence we’ve decided not to continue on this path for now. Sailfish X is a community program at the end of the day, and we value the feedback a lot. Thanks to all survey participants!

 

About Android app support for Sony Xperia X

The Sony Xperia X device has been the pilot and the flagship for the Sailfish X program. It has now been two years since the first release of Sailfish X for the Sony Xperia X, and we’ve released many software updates for it, with more to come in the future.

Support for new device options is a constant request from Sailfish X program users. Adding new devices to the Sailfish X portfolio comes with a cumulative maintenance cost. This has resulted in a decision that we won’t be upgrading the Android app support to Android 8.1 on the Xperia X, or other older generation devices like the Jolla C.

The problem is that porting Android app support 8.1 to the Xperia X would require updating the baseport to Android 8 on that device. We can’t do this over the air, which means we would have to create and mature a new hardware adaptation for the Xperia X, and then either stop supporting the old adaptation or support two adaptations for the same device. Supporting two different adaptations for the same device would obviously be more work than having just one, and if we only supported the new adaptation with software updates, existing users would need to reflash their devices or they would stop getting updates.

We simply don’t have the necessary resources to do it justice, given the older hardware and the several additional hardware adaptation versions that would need to be supported now and in the future.

This isn’t a decision we’ve taken lightly. We are rightly well-known in the industry for the exceptional long-term support we provide for all of our devices and we will naturally be providing all the regular Sailfish OS software updates for the Xperia X, including Android 4.4 support, just as we do for many other older devices. The recent release of Sailfish OS 3.2.0 Torronsuo underlines our commitment to this.

We hope you’ll enjoy Sailfish X on the new Xperia 10 and Xperia 10 Plus as much as we do and will take up our offer!

Keep on sailing,
Martin

The post Sailfish X for Sony Xperia 10 now available appeared first on Jolla Blog.

Monday, 18 November

09:30

Saturday Morning Breakfast Cereal - Pensive [Saturday Morning Breakfast Cereal]




Hovertext:
Why are people always trying to find out other people's thoughts? It's like they've never met a people before.



09:25

G-Scout Enterprise and Cloud Security at Etsy [Code as Craft]

As companies are moving to the cloud, they are finding a need for security tooling to audit and analyze their cloud environments. Over the last few years, various tools have been developed for this purpose. We’ll look at some of them and consider the uses for them. Specifically, we’ll take a close look at G-Scout, a tool I developed while working at NCC Group to look for security misconfigurations in Google Cloud Platform (GCP); and G-Scout Enterprise, a new tool with the same purpose, but tailored to the needs of security engineers at Etsy. We’ll also consider G-Scout Enterprise’s role within an ecosystem of other cloud logging and monitoring tools used at Etsy.

Cloud environments have a convenient feature which you won’t get from on-premises servers: they have APIs. It’s similar for all the major cloud providers. They have a REST API which provides information on what services are being used, what resources exist, and how they are configured. An authorized user can call these APIs through a command line tool, or programmatically through a client library.

Those APIs provide information which is useful for security purposes. A classic example is a storage bucket (S3, GCS, etc.) which has been made public. It could be publicly readable, or publicly writable. Since we can use the API to see the permissions on any bucket we own, we can look for misconfigured permissions. So we go through all the API data we have for all our storage buckets, and look for permissions assigned to allUsers, or allAuthenticatedUsers.
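
As a rough illustration of that kind of check (a sketch, not G-Scout's actual code), assuming the google-cloud-storage client library and default credentials; the project name is hypothetical:

from google.cloud import storage

client = storage.Client(project="my-project")  # hypothetical project

for bucket in client.list_buckets():
    # Version 3 policies expose bindings as a list of {"role", "members"} entries.
    policy = bucket.get_iam_policy(requested_policy_version=3)
    for binding in policy.bindings:
        members = set(binding.get("members", []))
        if members & {"allUsers", "allAuthenticatedUsers"}:
            print(f"{bucket.name} is public via {binding['role']}")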

Here are some other common examples:

  • Firewall rules are too permissive.
  • Unencrypted database connections.
  • Users have excessive permissions.

Configuration Scanning Tools

Rather than making API calls and processing the data ad hoc, you can create a framework. A tool that will allow you, with a single command, to run various API calls to gather data on diverse resources, and then programmatically look for misconfigurations in that data. And in the end, you can have the tool place the results into a human-readable HTML report which you can browse according to your whims.

Scout 2 does all of the above for Amazon Web Services (AWS). G-Scout was created with a similar solution in mind, but for GCP. Plenty of other examples have followed Scout 2. Some, like G-Scout, have been open source, and others are available for purchase.

These tools continue to evolve. It is becoming increasingly common for companies to use more than one cloud provider. With this trend we’ve seen the creation of multi-cloud tools. Scout Suite has replaced Scout 2. Inspec supports AWS, Azure, and GCP.

And some of them have added features. Forseti Inventory stores the data collected in a SQL database (I’ve moved G-Scout in a similar direction, as we’ll see later). Forseti Enforcer will actually make changes to match policies. 

These features are useful, but not so much to a consultant, since a consultant shouldn’t want any permissions aside from viewer permissions. Scout 2 was designed for consulting. The user can get viewer permissions, run the tool, and leave no trace. Forseti, on the other hand, requires Organization Admin permissions, and creates a database and other resources within the organization that is being audited.

Difficulties With G-Scout

But the same basic functionality remains at the core of each of these tools. When it came to G-Scout, that core functionality worked well for smaller companies, or those less committed to GCP. But when there are hundreds of projects, thousands of instances, and many other resources, it becomes difficult to go through the results. 

Adding to this difficulty is the presence of false positives. Any automated tool is going to turn up false positives. Context may mean that something which seems like a finding at first glance turns out to be acceptable. To return to our public storage bucket example, there are some cases where the content in the bucket is intended to be public. You can even serve a simple HTML website from a storage bucket. So it tends to fall to a human to go through and figure out which are false positives. Since it takes time to fix real findings, and the false positives don’t go away, running the tool frequently to see what’s new becomes untenable.

Finally, at Etsy, many of the findings G-Scout would turn up had already been found by other means, which we will explore a bit below.

We have a tool called Reactor. There is a stackdriver log sink for the organization, and those logs (with filters applied) go to a PubSub topic. There’s a cloud function that subscribes to that topic, and when it finds logs that match any of a further set of filters (the alerting rules) then it triggers an alert.

So for example, if someone makes a storage bucket public, an alert will trigger as soon as the corresponding stackdriver log is generated, rather than waiting for someone to run G-Scout at some point.
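
As a sketch of that pattern (not Etsy's Reactor code), assuming a Pub/Sub-triggered Python Cloud Function receiving exported audit log entries; the alerting helper is a placeholder:

import base64
import json

def send_alert(message: str) -> None:
    # Placeholder: in practice this would page or notify the security team.
    print("ALERT:", message)

def reactor_handler(event, context):
    """Inspect one exported log entry delivered via Pub/Sub."""
    entry = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    proto = entry.get("protoPayload", {})

    # Example rule: an IAM change on a storage bucket that grants public access.
    if proto.get("methodName") != "storage.setIamPermissions":
        return
    deltas = proto.get("serviceData", {}).get("policyDelta", {}).get("bindingDeltas", [])
    for delta in deltas:
        if delta.get("action") == "ADD" and delta.get("member") in ("allUsers", "allAuthenticatedUsers"):
            send_alert(f"{proto.get('resourceName')} was made public by "
                       f"{proto.get('authenticationInfo', {}).get('principalEmail')}")

The exact field names depend on the service emitting the log, so treat the rule above as illustrative rather than a drop-in filter.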

Here’s a partial example of a stackdriver log. As an API call to check IAM permissions would, it has all the information we need to trigger an alert. We see the user that granted the permission (in this case a service account). And below the fold we would see which role was assigned and which user it was assigned to.

Another point where we are alerting on misconfigurations is resource creation. We use Terraform for infrastructure as code. Before a Terraform apply is run, we have a series of unit tests that will be run by the pipeline. The unit tester runs tests for many of the same events which we alert on with the stackdriver logs. This includes the common example of a bucket being made public.

This is another process that is not so useful for a security consultant. But it’s better to catch misconfigurations in this way, than in the way Scout 2 or G-Scout would catch them, since this will prevent them from ever being implemented!

So we have what I’ll call a three-pronged approach to catching misconfigurations in GCP. These are the three prongs:

  • Terraform unit testing that is meant to catch misconfigurations before they go into effect. 
  • Stackdriver alerting that occurs when the resource is created or changed (whether those changes are made through Terraform or not).
  • And in case anything gets through the first two, we have the point in time audit of all GCP resources provided by G-Scout Enterprise.

In summary, G-Scout’s traditional purpose was proving minimally useful. It was difficult to make good use of the G-Scout reports. And as we’ve seen, the first two prongs will usually catch misconfigurations first. So I moved away from G-Scout, and toward a new creation: G-Scout Enterprise.

G-Scout Enterprise

The fundamental change is to replace the HTML report with a BigQuery data collection. In fact, at its core, G-Scout Enterprise is very simple. It’s mostly just something that takes API data and puts it into BigQuery. Then other systems can do with that data as they please. The rules that will trigger alerts can be written in our alerting system like any other alerts we have (though they can also easily be written in Python within G-Scout Enterprise). We are now putting all of our other data into BigQuery as well, so it’s all connected.
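
At that level of simplicity, the core loop looks roughly like the following sketch (not Etsy's code), assuming the google-api-python-client and google-cloud-bigquery libraries; the project, dataset, and table names are hypothetical, and the destination table is assumed to already exist with a matching schema:

from googleapiclient import discovery
from google.cloud import bigquery

compute = discovery.build("compute", "v1")
bq = bigquery.Client()

# Pull one API endpoint's data: the instances in a project/zone.
resp = compute.instances().list(project="my-project", zone="us-central1-a").execute()
rows = [
    {"name": inst["name"], "status": inst["status"], "zone": inst["zone"]}
    for inst in resp.get("items", [])
]

# Mirror that endpoint into a BigQuery table for later querying and joining.
errors = bq.insert_rows_json("my-project.gscout.compute_instances", rows)
if errors:
    raise RuntimeError(errors)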

Users can query any of the tables, each of which corresponds to one GCP API endpoint. G-Scout Enterprise tables can be joined – and they can be joined to our other data sources as well. And we can be very specific: like looking for all roles where amellos@etsy.com is a member, without enshrining it in our ruleset, because we can run queries through the BigQuery console. Or we can run queries in the command line, with helper functions that allow us to query with Python rather than SQL.
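
For example, the ad hoc role lookup described above could be run from Python along these lines (a sketch assuming the google-cloud-bigquery client; the gscout.iam_policies table and its columns are hypothetical):

from google.cloud import bigquery

client = bigquery.Client()

query = """
    SELECT project_id, role
    FROM `gscout.iam_policies`
    WHERE member = @member
"""
job = client.query(
    query,
    job_config=bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("member", "STRING", "user:amellos@etsy.com")
        ]
    ),
)
for row in job.result():
    print(row.project_id, row.role)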

We can make comparisons and track changes over time. It can also provide context to other alerts. For example, if we have an IP address from an SSH alert, we can get information about the instance which owns that IP address, such as what service account it has, or what Chef role it has. 

Or for instance, the following, more complicated scenario:

We run Nessus. Nessus is an automated vulnerability scanner. It has a library of vulnerabilities it looks for by making network requests. You give it a list of IPs and it goes through them all. We now have it running daily. With a network of any size the volume of findings will quickly become overwhelming. Many of them are informational or safely ignored. But the rest need to be triaged, and addressed in a systematic way.

Not all Nessus findings are created equal. The same vulnerability on two different instances may be much more concerning on one than the other: if one is exposed to the internet and the other is not; if one is working with PII and the other is not; if one is in development and the other in production, and so on. Most of the information which determines how concerned we are with a vulnerability can be found among the collection of data contained in G-Scout Enterprise. This has simplified our scanning workflow. Since we can do network analysis with the data in G-Scout Enterprise, we can identify which instances are accessible from where. That means we don’t have to scan from different perspectives. And it has improved the precision of our vulnerability triaging, since there is so much contextual data available.

So we go through the following process:

  1. Enumerate all instances in our GCP account.
  2. Discard duplicate instances (instances from the same template, e.g. our many identical web server instances).
  3. Run the Nessus scan and place the results into BigQuery.
  4. Create a joined table of firewall rules and instances which they apply to (matching tags).
  5. Take various network ranges (0.0.0.0/0, our corporate range, etc.), and for each firewall rule see if it allows traffic from that source.
  6. For instances with firewall rules that allow ingress from 0.0.0.0/0, see if the instance has a NatIP or is behind an external load balancer.
  7. Check whether the project the instance lives in is one of the projects classified as sensitive.
  8. Compute and assign scores according to the previous steps.

And then we save the results into BigQuery. That gives us historical data. We can see if we are getting better or worse. We can see if we have certain troublemaker projects. We can empower our patch management strategy with a wealth of data.
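
A condensed sketch of the scoring step might look like this; the field names, weights, and example values are hypothetical, not Etsy's actual rules:

from dataclasses import dataclass

@dataclass
class Finding:
    instance: str
    cvss: float              # severity reported by the scanner
    internet_exposed: bool   # ingress from 0.0.0.0/0 plus a public IP or external LB
    sensitive_project: bool  # project classified as sensitive

def score(finding: Finding) -> float:
    """Combine scanner severity with environmental context (illustrative weights)."""
    s = finding.cvss
    if finding.internet_exposed:
        s *= 1.5
    if finding.sensitive_project:
        s *= 1.25
    return round(s, 2)

findings = [
    Finding("web-1", cvss=7.5, internet_exposed=True, sensitive_project=False),
    Finding("db-1", cvss=7.5, internet_exposed=False, sensitive_project=True),
]
for f in sorted(findings, key=score, reverse=True):
    print(f.instance, score(f))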

Conclusion

That leaves us with a few main lessons gained from adapting G-Scout to Etsy:

  • It’s useful to store cloud environment info in a database. That makes it easier to work with, and easier to integrate with other data sources.
  • The needs of a consultant are different from the needs of a security engineer. Although there is crossover, different tools may better fit the needs of one or the other.
  • The three-pronged alerting approach described above provides a comprehensive solution for catching security misconfigurations in the cloud.

One last note is that we have plans to open source G-Scout Enterprise in the coming months.

01:50

Fedora shirts and sweatshirts from HELLOTUX [Fedora Magazine]

Linux clothes specialist HELLOTUX from Europe recently signed an agreement with Red Hat to make embroidered Fedora t-shirts, polo shirts and sweatshirts. They have been making Debian, Ubuntu, openSUSE, and other Linux shirts for more than a decade and now the collection is extended to Fedora.

Embroidered Fedora polo shirt.

Instead of printing, they use programmable embroidery machines to make the Fedora embroidery. All of the design work is done exclusively on Linux; this is a matter of principle.

Some photos of the embroidering process for a Fedora sweatshirt:

You can get Fedora polos and t-shirts in blue or black and the sweatshirt in gray here.

Oh, “just one more thing,” as Columbo used to say: Now, HELLOTUX pays the shipping fee for the purchase of two or more items, worldwide, if you order within a week from now. Order on the HELLOTUX website.

Sunday, 17 November

07:00

Log every request to corporate apps, no code changes required [The Cloudflare Blog]


When a user connects to a corporate network through an enterprise VPN client, this is what the VPN appliance logs:


The administrator of that private network knows the user opened the door at 12:15:05, but, in most cases, has no visibility into what they did next. Once inside that private network, users can reach internal tools, sensitive data, and production environments. Preventing this requires complicated network segmentation, and often server-side application changes. Logging the steps that an individual takes inside that network is even more difficult.

Cloudflare Access does not improve VPN logging; it replaces this model. Cloudflare Access secures internal sites by evaluating every request, not just the initial login, for identity and permission. Instead of a private network, administrators deploy corporate applications behind Cloudflare using our authoritative DNS. Administrators can then integrate their team’s SSO and build user and group-specific rules to control who can reach applications behind the Access Gateway.

When a request is made to a site behind Access, Cloudflare prompts the visitor to login with an identity provider. Access then checks that user’s identity against the configured rules and, if permitted, allows the request to proceed. Access performs these checks on each request a user makes in a way that is transparent and seamless for the end user.

However, since the day we launched Access, our logging has resembled the screenshot above. We captured when a user first authenticated through the gateway, but that’s where it stopped. Starting today, we can give your team the full picture of every request made to every application.

We’re excited to announce that you can now capture logs of every request a user makes to a resource behind Cloudflare Access. In the event of an emergency, like a stolen laptop, you can now audit every URL requested during a session. Logs are standardized in one place, regardless of whether you use multiple SSO providers or secure multiple applications, and the Cloudflare Logpush platform can send them to your SIEM for retention and analysis.

Auditing every login

Cloudflare Access brings the speed and security improvements Cloudflare provides to public-facing sites and applies those lessons to the internal applications your team uses. For most teams, these were applications that traditionally lived behind a corporate VPN. Once a user joined that VPN, they were inside that private network, and administrators had to take additional steps to prevent users from reaching things they should not have access to.

Access flips this model by assuming no user should be able to reach anything by default; applying a zero-trust solution to the internal tools your team uses. With Access, when any user requests the hostname of that application, the request hits Cloudflare first. We check to see if the user is authenticated and, if not, send them to your identity provider like Okta, or Azure ActiveDirectory. The user is prompted to login, and Cloudflare then evaluates if they are allowed to reach the requested application. All of this happens at the edge of our network before a request touches your origin, and for the user, it feels like the seamless SSO flow they’ve become accustomed to for SaaS apps.


When a user authenticates with your identity provider, we audit that event as a login and make those available in our API. We capture the user’s email, their IP address, the time they authenticated, the method (in this case, a Google SSO flow), and the application they were able to reach.


These logs can help you track every user who connected to an internal application, including contractors and partners who might use different identity providers. However, this logging stopped at the authentication. Access did not capture the next steps of a given user.

Auditing every request

Cloudflare secures both external-facing sites and internal resources by triaging each request in our network before we ever send it to your origin. Products like our WAF enforce rules to protect your site from attacks like SQL injection or cross-site scripting. Likewise, Access identifies the principal behind each request by evaluating each connection that passes through the gateway.

Once a member of your team authenticates to reach a resource behind Access, we generate a token for that user that contains their SSO identity. The token is structured as a JSON Web Token (JWT), an open standard for signing and encrypting sensitive information. These tokens provide a secure and information-dense mechanism that Access can use to verify individual users. Cloudflare signs the JWT using a public and private key pair that we control. We rely on RSA Signature with SHA-256, or RS256, an asymmetric algorithm, to perform that signature. We also make the public key available so that you can validate the tokens’ authenticity yourself.
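
An origin (or an auditing script) can check such a token along these lines; a minimal sketch assuming the PyJWT library, with the key, audience tag, and token as placeholders you would take from your own Access configuration:

import jwt  # PyJWT

# Placeholders: the public key comes from your Access certs endpoint and the
# audience (AUD) tag from your Access application settings.
PUBLIC_KEY_PEM = open("access-public-key.pem").read()
EXPECTED_AUD = "your-access-application-aud-tag"

def verify_access_jwt(token: str) -> dict:
    """Verify the RS256 signature and audience, returning the token's claims."""
    return jwt.decode(token, PUBLIC_KEY_PEM, algorithms=["RS256"], audience=EXPECTED_AUD)

# Example usage (token_from_request is whatever JWT accompanied the request):
# claims = verify_access_jwt(token_from_request)
# print(claims.get("email"))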

When a user requests a given URL, Access appends the user identity from that token as a request header, which we then log as the request passes through our network. Your team can collect these logs in your preferred third-party SIEM or storage destination by using the Cloudflare Logpush platform.

Cloudflare Logpush can be used to gather and send specific request headers from the requests made to sites behind Access. Once enabled, you can then configure the destination where Cloudflare should send these logs. When enabled with the Access user identity field, the logs will export to your systems as JSON similar to the logs below.

{
   "ClientIP": "198.51.100.206",
   "ClientRequestHost": "jira.widgetcorp.tech",
   "ClientRequestMethod": "GET",
   "ClientRequestURI": "/secure/Dashboard/jspa",
   "ClientRequestUserAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36",
   "EdgeEndTimestamp": "2019-11-10T09:51:07Z",
   "EdgeResponseBytes": 4600,
   "EdgeResponseStatus": 200,
   "EdgeStartTimestamp": "2019-11-10T09:51:07Z",
   "RayID": "5y1250bcjd621y99",
   "RequestHeaders": {"cf-access-user": "srhea"}
}
 
{
   "ClientIP": "198.51.100.206",
   "ClientRequestHost": "jira.widgetcorp.tech",
   "ClientRequestMethod": "GET",
   "ClientRequestURI": "/browse/EXP-12",
   "ClientRequestUserAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36",
   "EdgeEndTimestamp": "2019-11-10T09:51:27Z",
   "EdgeResponseBytes": 4570,
   "EdgeResponseStatus": 200,
   "EdgeStartTimestamp": "2019-11-10T09:51:27Z",
   "RayID": "yzrCqUhRd6DVz72a",
   "RequestHeaders": {"cf-access-user": "srhea"}
}

In the example above, the user initially visited the splash page for a sample Jira instance. The next request was made to a specific Jira ticket, EXP-12, about one minute after the first request. With per-request logging, Access administrators can review each request a user made once authenticated in the event that an account is compromised or a device stolen.

The logs are consistent across all applications and identity providers. The same standard fields are captured when contractors login with their AzureAD instance to your supply chain tool as when your internal users authenticate with Okta to your Jira. You can also augment the data above with other request details like TLS cipher used and WAF results.

How can this data be used?

The native logging capabilities of hosted applications vary wildly. Some tools provide more robust records of user activity, but others would require server-side code changes or workarounds to add this level of logging. Cloudflare Access can give your team the ability to skip that work and introduce logging in a single gateway that applies to all resources protected behind it.

The audit logs can be exported to third-party SIEM tools or S3 buckets for analysis and anomaly detection. The data can also be used for audit purposes in the event that a corporate device is lost or stolen. Security teams can then use this to recreate user sessions from logs as they investigate.
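
As one example of that kind of investigation, a short script can rebuild a user's trail from exported records; a sketch assuming the logs arrive as newline-delimited JSON with the fields shown earlier (the file name is hypothetical):

import json
from collections import defaultdict

sessions = defaultdict(list)
with open("access_requests.ndjson") as fh:  # hypothetical Logpush export file
    for line in fh:
        record = json.loads(line)
        user = record.get("RequestHeaders", {}).get("cf-access-user", "unknown")
        sessions[user].append(
            (record["EdgeStartTimestamp"],
             record["ClientRequestHost"] + record["ClientRequestURI"])
        )

# Print each user's requests in order, e.g. while reviewing a stolen-laptop incident.
for user, requests in sessions.items():
    print(user)
    for ts, url in sorted(requests):
        print(" ", ts, url)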

What’s next?

Any enterprise customer with Logpush enabled can now use this feature at no additional cost. Instructions are available here to configure Logpush and additional documentation here to enable Access per-request logs.

Friday, 15 November

01:00

Fedora pastebin and fpaste updates [Fedora Magazine]

Fedora and EPEL users who use fpaste to paste and share snippets of text might have noticed some changes recently. An update went out which sends pastes made by fpaste to the CentOS Pastebin instead of the Modern Paste instance that Fedora was running. Don’t fear — this was an intentional change, and is part of the effort to lower the workload within the Fedora Infrastructure and Community Platform Engineering teams. Keep reading to learn more about what’s happening with pastebin and your pastes.

About the service

A pastebin lets you save text on a website for a length of time. This helps you exchange data easily with other users. For example, you can post error messages for help with a bug or other issue.

The CentOS Pastebin is a community-maintained service that keeps pastes around for up to 24 hours. It also offers syntax highlighting for a large number of programming and markup languages.

As before, you can paste files:

$ fpaste sql/010.add_owner_ip_index.sql 
Uploading (0.1KiB)...
https://paste.centos.org/view/6ee941cc

…or command output…

$ rpm -ql python3 | fpaste
Uploading (0.7KiB)...
https://paste.centos.org/view/44945a99

…or system information:

$ fpaste --sysinfo 
Gathering system info .............Uploading (8.1KiB)...
https://paste.centos.org/view/8d5bb827

What to expect from Pastebin

On December 1st, 2019, Fedora Infrastructure will turn off its Modern Paste servers. It will then redirect fpaste.org, www.fpaste.org, and paste.fedoraproject.org to paste.centos.org.

If you notice any issues with fpaste, first try updating your fpaste package. On Fedora use this command:

$ dnf update fpaste

Or, on machines that use the EPEL repository, use this command:

$ yum update fpaste

If you still run into issues, please file a bug on the fpaste issue tracker, and please be as detailed as possible. Happy pasting!


Photo by Kelly Sikkema on Unsplash.

Thursday, 14 November

Wednesday, 13 November

17:01

Cloudflare Shenzhen Business and Technology Meetup [The Cloudflare Blog]

Shenzhen is known as the “Silicon Valley of China.” Favorable policy support and Silicon Valley-like values have attracted a large number of technology companies to set up and grow there, and its top-tier, well-rounded startup ecosystem has incubated many startups of global renown. Local tech giants such as Huawei and Tencent have established R&D centers in the city.

Over the past year, Cloudflare has made many exciting advances in products and services, including a network that now spans more than 194 cities and newly launched products such as Bot Management and Magic Transit. With great enthusiasm, Cloudflare’s APAC and China teams hosted the company’s first business and technology meetup in mainland China at the Marriott hotel in Shenzhen’s Nanshan district, and warmly invited customers and partners from Shenzhen and the surrounding area to take part. The event was invitation-only; the guests included customer leaders and experts from a range of industries as well as partners’ business executives and technical experts. It was a unique opportunity for guests to discuss products and technology with Cloudflare’s business representatives and technical experts, and to share their experiences, in a relaxed and comfortable setting.

Business session

The meetup began at 2 p.m. on October 31, 2019. Customer and partner guests arrived and took their seats on time, and we welcomed them with relaxed music, custom gifts, and refreshments.

Xavier Cai, General Manager for China, is responsible for Cloudflare’s overall development and business growth in mainland China. He has more than 15 years of industry and regional sales management experience, and previously held senior business leadership roles at multinational companies such as F5 and HP/HPE.

Together with Customer Success Manager Sophie, Xavier presented Cloudflare’s business development in China, with a particular focus on how the local team has been built up. Working with the APAC region and US headquarters, Cloudflare China has assembled an efficient, professional team that provides customers and partners in China with round-the-clock business service and technical support in both Chinese and English, removing concerns about timely service and support as well as language barriers. Xavier also highlighted the strengths of a multinational team: Cloudflare combines its huge global network, cutting-edge products and services, industry experience from around the world, and partner resources to help local Chinese customers solve business problems and eliminate pain points efficiently and at low cost. At the same time, Cloudflare listens closely to valuable feedback from customers in China, which helps improve product features and service efficiency.

Agile development with Workers and Access

Combining performance optimization and security effectively

Xin Meng is a senior solutions engineer for the APAC region and a graduate of the National University of Singapore. He previously worked at well-known international companies such as Merrill Lynch, Qualitics, and Singtel, helping customers with solutions for network security, optimization, and architecture.

Drawing on the best of a rich set of existing customer solutions, he shared how to develop rapidly with Cloudflare Workers and Access, walked through the most representative case studies for customers to reference, and answered key questions such as:

  • What is a serverless architecture?
  • What is a Worker?
  • What are the best practices for developing with Workers?

Bot Management and Magic Transit

Introducing advanced products for security protection and performance optimization

Cloudflare Bot Management uses traffic data from more than 20 million websites to effectively detect, manage, and block automated bots and scraper attacks, and can be enabled with a single click.

Cloudflare Bot Management effectively addresses the following common attacks:

  • Credential stuffing: brute-force login attempts using credentials already stolen from other websites, in order to take over accounts and steal user information
  • Content scraping: maliciously crawling and stealing web page content
  • Click fraud: automatically simulating clicks on ad links and filling contact pages with bogus or spam information
  • Inventory hoarding: placing bogus orders to tie up inventory, driving away legitimate customers or reselling goods at inflated prices
  • Carding: brute-force validation of stolen credit card numbers followed by fraudulent purchases

Xin Meng presented the strengths of Bot Management. Cloudflare’s worldwide network supplies an enormous pool of data, which makes Bot Management smarter through its machine learning, behavioral analysis, and automatic whitelist modules, keeping it ahead of the pack in an information security era where data is king.

Magic Transit is another high-profile product Cloudflare launched this year. With more than 30 Tbps of global DDoS scrubbing capacity, it extends Cloudflare’s services to the network layer. Xin Meng focused on how Magic Transit works: customer traffic is directed to Cloudflare’s data centers via BGP announcements, all incoming traffic is automatically scrubbed and protected, and the clean traffic is routed back to the customer’s data center. Anycast GRE tunnels help simplify configuration, and Cloudflare’s well-known DDoS mitigation technology is put to work for the customer’s own data center.

More sessions are on the way, so stay tuned!


We set aside time during the meetup for Q&A and open discussion, accompanied by a custom tea break, and gathered a great deal of valuable first-hand feedback and suggestions from customers. The event ran from start to finish in a relaxed and lively atmosphere, and many customers stayed afterwards to keep talking and share their views.

During the mid-session tea break, the Cloudflare team enjoyed chatting with customers.

Thank you to all the customers and partners who took part so enthusiastically. The constructive feedback gives us even more confidence in planning upcoming events and encourages us to prepare more targeted content. We will hold two more meetups in 2020. If you have any ideas or suggestions (for example, which city you would like us to visit, or which topics interest you), please leave them in the comments; we look forward to weaving your insights into the next event!

A group photo with the customers who attended:

Cloudflare colleagues who took part in the Shenzhen meetup: Sophie Qiu (organizer) | Xavier Cai | Xin Meng | George Sun | Vincent Liu | Bruce Li | Alex Thiang | Ellen Miao | Adam Luo | Xiaojin Liu

00:52

Edit images on Fedora easily with GIMP [Fedora Magazine]

GIMP (short for GNU Image Manipulation Program) is free and open-source image manipulation software. With many capabilities ranging from simple image editing to complex filters, scripting and even animation, it is a good alternative to popular commercial options.

Read on to learn how to install and use GIMP on Fedora. This article covers basic daily image editing.

Installing GIMP

GIMP is available in the official Fedora repository. To install it run:

sudo dnf install gimp

Single window mode

Once you open the application, it shows you the dark-theme window with the toolbox and the main editing area. Note that it has two window modes that you can switch between by selecting Windows -> Single Window Mode. By checking this option, all components of the UI are displayed in a single window; otherwise, they will be separate.

Loading an image

Fedora 30 Background

To load an image, go to File -> Open and choose your image file.

Resizing an image

To resize the image, you have the option to scale based on a couple of parameters, including pixels and percentage — the two parameters that are most often handy when editing images.

Let’s say we need to scale down the Fedora 30 background image to 75% of its current size. To do that, select Image -> Scale and then on the scale dialog, select percentage in the unit drop down. Next, enter 75 as width or height and press the Tab key. By default, the other dimension will automatically resize in correspondence with the changed dimension to preserve aspect ratio. For now, leave other options unchanged and press Scale.

Scale Dialog In GIMP

The image scales to 75 percent of its original size.

Rotating images

Rotating is a transform operation, so you find it under Image -> Transform in the main menu, where there are options to rotate the image by 90 or 180 degrees. There are also options there for flipping the image vertically or horizontally.

Let’s say we need to rotate the image 90 degrees. After applying a 90-degree clockwise rotation and horizontal flip, our image will look like this:

Transforming an image with GIMP

Adding text

Adding text is very easy. Just select the A icon from the toolbox, and click on a point on your image where you want to add the text. If the toolbox is not visible, open it from Windows->New Toolbox.

As you edit the text, you might notice that the text dialog has font customization options including font family, font size, etc.

Add Text To Images
Adding text to image in GIMP

Saving and exporting

You can save your work as a GIMP project with the xcf extension from File -> Save or by pressing Ctrl+S. Or you can export your image in formats such as PNG or JPEG. To export, go to File -> Export As or hit Ctrl+Shift+E, and you will be presented with a dialog where you can select the output format and file name.

Tuesday, 12 November

Monday, 11 November

00:44

Understanding “disk space math” [Fedora Magazine]

Everything in a PC, laptop, or server is represented as binary digits (a.k.a. bits, where each bit can only be 1 or 0). There are no characters like we use for writing or numbers as we write them anywhere in a computer’s memory or secondary storage such as disk drives. For general purposes, the unit of measure for groups of binary bits is the byte — eight bits. Bytes are an agreed-upon measure that helped standardize computer memory, storage, and how computers handled data.

There are various terms in use to specify the capacity of a disk drive (either magnetic or electronic). The same measures are applied to a computer’s random access memory (RAM) and other memory devices that inhabit your computer. So now let’s see how the numbers are made up.

Prefixes are used with the number that specifies the capacity of the device. The prefixes designate a multiplier that is to be applied to the number that preceded the prefix. Commonly used prefixes are:

  • Kilo = 10^3 = 1,000 (one thousand)
  • Mega = 10^6 = 1,000,000 (one million)
  • Giga = 10^9 = 1,000,000,000 (one billion)
  • Tera = 10^12 = 1,000,000,000,000 (one trillion)

As an example 500 GB (gigabytes) is 500,000,000,000 bytes.

The units used to specify memory and storage capacity in advertisements, on boxes in the store, and so on are decimal units, as shown above. However, since computers only use binary bits, the actual capacity of these devices differs from the advertised capacity.

You saw that the decimal numbers above were shown with their equivalent powers of ten. In the binary system, numbers can be represented as powers of two. The table below shows how bits are used to represent powers of two in an 8-bit byte. At the bottom of the table there is an example of how the decimal number 109 can be represented as a binary number that can be held in a single byte of 8 bits (01101101).

Eight-bit binary number

Bit:              7     6     5     4     3     2     1     0
Power of 2:      2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
Decimal value:   128    64    32    16     8     4     2     1
Example number:    0     1     1     0     1     1     0     1

The example bit values comprise the binary number 01101101. To get the equivalent decimal value just add the decimal values from the table where the bit is set to 1. That is 64 + 32 + 8 + 4 + 1 = 109.
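
You can check this arithmetic quickly with a few lines of Python:

value = 0b01101101                     # the byte from the table above
print(value)                           # 109
print(64 + 32 + 8 + 4 + 1)             # 109
print(bin(109))                        # 0b1101101
print(sum(2**i for i in range(8) if (value >> i) & 1))  # 109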

By the time you get out to 2^30 you have decimal 1,073,741,824 with just 31 bits (don’t forget the 2^0). You’ve got a large enough number to start specifying memory and storage sizes.

Now comes what you have been waiting for. The table below lists common designations as they are used for labeling decimal and binary values.

Decimal                                          Binary
KB (Kilobyte)   1 KB = 1,000 bytes               KiB (Kibibyte)   1 KiB = 1,024 bytes
MB (Megabyte)   1 MB = 1,000,000 bytes           MiB (Mebibyte)   1 MiB = 1,048,576 bytes
GB (Gigabyte)   1 GB = 1,000,000,000 bytes       GiB (Gibibyte)   1 GiB = 1,073,741,824 bytes
TB (Terabyte)   1 TB = 1,000,000,000,000 bytes   TiB (Tebibyte)   1 TiB = 1,099,511,627,776 bytes

Note that all of the quantities of bytes in the table above are expressed as decimal numbers. They are not shown as binary numbers because those numbers would be more than 30 characters long.

Most users and programmers need not be concerned with the small differences between the binary and decimal storage size numbers. If you’re developing software or hardware that deals with data at the binary level you may need the binary numbers.
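
For example, the gap between an advertised capacity and what a binary-based tool reports is easy to compute (a quick Python sketch):

advertised_bytes = 500 * 10**9                    # a "500 GB" drive as marketed
print(advertised_bytes / 10**9, "GB")             # 500.0 GB  (decimal units)
print(round(advertised_bytes / 2**30, 2), "GiB")  # 465.66 GiB (binary units)

Nothing is missing from the drive; the two numbers simply use different units.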

As for what this means to your PC: Your PC will make use of the full capacity of your storage and memory devices. If you want to see the capacity of your disk drives, thumb drives, etc, the Disks utility in Fedora will show you the actual capacity of the storage device in number of bytes as a decimal number.

There are also command line tools that can provide you with more flexibility in seeing how your storage bytes are being used. Two such command line tools are du (for files and directories) and df (for file systems). You can read about these by typing man du or man df at the command line in a terminal window.


Photo by Franck V. on Unsplash.

Sunday, 10 November

17:00

Remember Clusterman? Now It's Open-Source, and Supports Kubernetes Too! [Yelp Engineering and Product Blog]

Earlier this year, I wrote a blog post showing off some cool features of our in-house compute cluster autoscaler, Clusterman (our Cluster Manager). This time, I’m back with two announcements that I’m really excited about! Firstly, in the last few months, we’ve added another supported backend to Clusterman; so not only can it scale Mesos clusters, it can also scale Kubernetes clusters. Second, Clusterman is now open-source on GitHub so that you, too, can benefit from advanced autoscaling techniques for your compute clusters. If you prefer to just read the code, you can head there now to find some examples...

Friday, 08 November

01:00

Managing software and services with Cockpit [Fedora Magazine]

The Cockpit series continues to focus on some of the tools users and administrators can use to perform everyday tasks within the web user interface. So far we’ve covered introducing the user interface, storage and network management, and user accounts. This article will highlight how Cockpit handles software and services.

The menu options for Applications and Software Updates are available through Cockpit’s PackageKit feature. To install it from the command-line, run:

 sudo dnf install cockpit-packagekit

For Fedora Silverblue, Fedora CoreOS, and other ostree-based operating systems, install the cockpit-ostree package and reboot the system:

sudo rpm-ostree install cockpit-ostree; sudo systemctl reboot

Software updates

On the main screen, Cockpit notifies the user whether the system is updated, or if any updates are available. Click the Updates Available link on the main screen, or Software Updates in the menu options, to open the updates page.

RPM-based updates

The top of the screen displays general information such as the number of updates and the number of security-only updates. It also shows when the system was last checked for updates, and provides a button to perform the check, which is equivalent to running the command sudo dnf check-update.

Below is the Available Updates section, which lists the packages requiring updates. Furthermore, each package displays the name, version, and best of all, the severity of the update. Clicking a package in the list provides additional information such as the CVE, the Bugzilla ID, and a brief description of the update. For details about the CVE and related bugs, click their respective links.

Also, one of the best features about Software Updates is the option to only install security updates. Distinguishing which updates to perform makes it simple for those who may not need, or want, the latest and greatest software installed. Of course, one can always use Red Hat Enterprise Linux or CentOS for machines requiring long-term support.

The example below demonstrates how Cockpit applies RPM-based updates.

Running system updates with RPM-based operating systems in Cockpit.

OSTree-based updates

The popular article What is Silverblue states:

OSTree is used by rpm-ostree, a hybrid package/image based system… It atomically replicates a base OS and allows the user to “layer” the traditional RPM on top of the base OS if needed.

Because of this setup, Cockpit uses a snapshot-like layout for these operating systems. As seen in the demo below, the top of the screen displays the repository (fedora), the base OS image, and a button to Check for Updates.

Clicking the repository name (fedora in the demo below) opens the Change Repository screen. From here one can Add New Repository, or click the pencil icon to edit an existing repository. Editing provides the option to delete the repository, or Add Another Key. To add a new repository, enter the name and URL. Also, select whether or not to Use trusted GPG key.

There are three categories that provide details of its respective image: Tree, Packages, and Signature. Tree displays basic information such as the operating system, version of the image, how long ago it was released, and the origin of the image. Packages displays a list of installed packages within that image. Signature verifies the integrity of the image such as the author, date, RSA key ID, and status.

The current, or running, image displays a green check-mark beside it. If something happens, or an update causes an issue, click the Roll Back and Reboot button. This restores the system to a previous image.

Running system updates with OSTree-based operating systems in Cockpit.

Applications

The Applications screen displays a list of add-ons available for Cockpit. This makes it easy to find and install the plugins required by the user. At the time of this article, some of the options include the 389 Directory Service, Fleet Commander, and Subscription Manager. The demo below shows a complete list of available Cockpit add-ons.

Also, each item displays the name, a brief description, and a button to install, or remove, the add-on. Furthermore, clicking the item displays more information (if available). To refresh the list, click the icon at the top-right corner.

Managing Cockpit application add-ons and features

Subscription Management

Subscription managers allow admins to attach subscriptions to the machine. Even more, subscriptions give admins control over user access to content and packages. One example of this is the famous Red Hat subscription model. This feature works in conjunction with the subscription-manager command.

The Subscriptions add-on can be installed via Cockpit’s Applications menu option. It can also be installed from the command-line with:

sudo dnf install cockpit-subscriptions

To begin, click Subscriptions in the main menu. If the machine is currently unregistered, it opens the Register System screen. Next, select the URL. You can choose Default, which uses Red Hat’s subscription server, or enter a Custom URL. Enter the Login, Password, Activation Key, and Organization ID. Finally, to complete the process, click the Register button.

The main page for Subscriptions shows whether the machine is registered, the System Purpose, and a list of installed products.

Managing subscriptions in Cockpit

Services

To start, click the Services menu option. Because Cockpit uses systemd, we get the options to view System Services, Targets, Sockets, Timers, and Paths. Cockpit also provides an intuitive interface to help users search and find the service they want to configure. Services can also be filtered by their state: All, Enabled, Disabled, or Static. Below this is the list of services. Each row displays the service name, description, state, and automatic startup behavior.

For example, let’s take bluetooth.service. Typing bluetooth in the search bar automatically displays the service. Now, select the service to view the details of that service. The page displays the status and path of the service file. It also displays information in the service file such as the requirements and conflicts. Finally, at the bottom of the page, are the logs pertaining to that service.

Also, users can quickly start and stop the service by toggling the switch beside the service name. The three dots to the right of that switch expand the options to Enable, Disable, or Mask/Unmask the service.

To learn more about systemd, check out the series in the Fedora Magazine starting with What is an init system?

Managing services in Cockpit

In the next article we’ll explore the security features available in Cockpit.

Thursday, 07 November

17:00

Inside TensorFlow [Yelp Engineering and Product Blog]

Inside TensorFlow It’s probably not surprising that Yelp utilizes deep neural networks in its quest to connect people with great local businesses. One example is the selection of photos you see in the Yelp app and website, where neural networks try to identify the best quality photos for the business displayed. A crucial component of our deep learning stack is TensorFlow (TF). In the process of deploying TF to production, we’ve learned a few things that may not be commonly known in the Data Science community. TensorFlow’s success stems not only from its popularity within the machine learning domain, but...

07:45

Tuning your bash or zsh shell on Fedora Workstation and Silverblue [Fedora Magazine]

This article shows you how to set up some powerful tools in your command line interpreter (CLI) shell on Fedora. If you use bash (the default) or zsh, Fedora lets you easily set up these tools.

Requirements

Some installed packages are required. On Workstation, run the following command:

sudo dnf install git wget curl ruby ruby-devel zsh util-linux-user redhat-rpm-config gcc gcc-c++ make

On Silverblue run:

sudo rpm-ostree install git wget curl ruby ruby-devel zsh util-linux-user redhat-rpm-config gcc gcc-c++ make

Note: On Silverblue you need to restart before proceeding.

Fonts

You can give your terminal a new look by installing new fonts. Why not fonts that display characters and icons together?

Nerd-Fonts

Open a new terminal and type the following commands:

git clone --depth=1 https://github.com/ryanoasis/nerd-fonts ~/.nerd-fonts
cd ~/.nerd-fonts
sudo ./install.sh

Awesome-Fonts

On Workstation, install using the following command:

sudo dnf install fontawesome-fonts

On Silverblue, type:

sudo rpm-ostree install fontawesome-fonts

Powerline

Powerline is a statusline plugin for vim, and provides statuslines and prompts for several other applications, including bash, zsh, tmux, i3, Awesome, IPython and Qtile. You can find more information about powerline on the official documentation site.

Installation

To install powerline utility on Fedora Workstation, open a new terminal and run:

sudo dnf install powerline vim-powerline tmux-powerline powerline-fonts

On Silverblue, the command changes to:

sudo rpm-ostree install powerline vim-powerline tmux-powerline powerline-fonts

Note: On Silverblue, you need to restart before proceeding.

Activating powerline

To make powerline active by default, place the code below at the end of your ~/.bashrc file:

if [ -f `which powerline-daemon` ]; then
  powerline-daemon -q
  POWERLINE_BASH_CONTINUATION=1
  POWERLINE_BASH_SELECT=1
  . /usr/share/powerline/bash/powerline.sh
fi

Finally, close the terminal and open a new one. It will look like this:

Oh-My-Zsh

Oh-My-Zsh is a framework for managing your Zsh configuration. It comes bundled with helpful functions, plugins, and themes. To learn how to set Zsh as your default shell, see this article.

Installation

Type this in the terminal:

sh -c "$(curl -fsSL https://raw.github.com/robbyrussell/oh-my-zsh/master/tools/install.sh)"

Alternatively, you can type this:

sh -c "$(wget https://raw.github.com/robbyrussell/oh-my-zsh/master/tools/install.sh -O -)"

At the end, your terminal will look like this:

Congratulations, Oh-My-Zsh is installed.

Themes

Once installed, you can select your theme. I prefer to use Powerlevel10k. One advantage is that it is 100 times faster than the powerlevel9k theme. To install it, run:

git clone https://github.com/romkatv/powerlevel10k.git ~/.oh-my-zsh/themes/powerlevel10k

Then set ZSH_THEME in your ~/.zshrc file:

ZSH_THEME=powerlevel10k/powerlevel10k

Close the terminal. When you open the terminal again, the Powerlevel10k configuration wizard will ask you a few questions to configure your prompt properly.

After finishing the Powerlevel10k configuration wizard, your prompt will look like this:

If you don't like it, you can run the Powerlevel10k wizard again at any time with the command p10k configure.

Enable plug-ins

Plug-ins are stored in the ~/.oh-my-zsh/plugins folder. You can visit this site for more information. To activate a plug-in, you need to edit your ~/.zshrc file. Enabling a plug-in typically creates a series of aliases or shortcuts that execute a specific function.

For example, to enable the firewalld and git plugins, first edit ~/.zshrc:

plugins=(firewalld git)

Note: use a blank space to separate the plug-in names in the list.

Then reload the configuration

source ~/.zshrc 

To see the created aliases, use the command:

alias | grep firewall

Additional configuration

I suggest installing the zsh-syntax-highlighting and zsh-autosuggestions plug-ins:

git clone https://github.com/zsh-users/zsh-syntax-highlighting.git ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-syntax-highlighting
git clone https://github.com/zsh-users/zsh-autosuggestions ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-autosuggestions

Add them to the plug-ins list in your ~/.zshrc file:

plugins=( [plugins...] zsh-syntax-highlighting zsh-autosuggestions)

Reload the configuration

source ~/.zshrc 

See the results:

Colored folders and icons

Colorls is a Ruby gem that beautifies the terminal’s ls command, with colors and font-awesome icons. You can visit the official site for more information.

Because it’s a ruby gem, just follow this simple step:

sudo gem install colorls

To keep up to date, just do:

sudo gem update colorls

To avoid typing colorls every time, you can define aliases in your ~/.bashrc or ~/.zshrc:

alias ll='colorls -lA --sd --gs --group-directories-first'
alias ls='colorls --group-directories-first'

You can also enable tab completion for colorls flags by adding the following line at the end of your shell configuration:

source $(dirname $(gem which colorls))/tab_complete.sh

Reload it and see what happens:

01:00

What's new in the French-speaking world? [The Cloudflare Blog]

A look back at the first Cloudflare event for our French-speaking customers and prospects

Cloudflare in France, Belgium and Switzerland means more than a hundred Enterprise customers, several thousand organizations on self-service plans, and a team of more than fifteen people supporting our French-speaking customers with the technical and commercial management of their accounts (Business Development Representatives, Customer Success Managers, Account Executives, Solutions Engineers, Support Engineers). Just last year, this team numbered only five people.

Part of the Cloudflare team
General presentation of Cloudflare by David Lallement

And our customers are growing too! From start-ups such as Happn and Back Market, to large groups such as Solocal-Pages Jaunes, to NGOs and public-sector organizations like the European Broadcasting Union, Cloudflare is winning over more and more of the French-speaking world.

This event was an opportunity for this growing community of prospects and customers to meet, listen to each other, and discuss their challenges, whether directly related to Cloudflare or not. All in a friendly, relaxed setting with a view of the Champs-Élysées from the sunny balcony of the Maison du Danemark. Not forgetting a few petits fours and a little wine (we are in France, after all!).

Welcome breakfast
View of the Champs-Élysées
A friendly break with our customer Deindeal, the company behind Deindeal.ch and My-store.ch, Switzerland's two leading flash-sale e-commerce platforms.
Break on the balcony of the Maison du Danemark

But enough about the logistics - what did we talk about during this morning session?

On the agenda:

  • Presentation of Cloudflare and its new products
  • The Internet and the Cloudflare network
  • Presentation by Marlin Cloud on the implementation for their customer AB InBev
  • Presentation by Solocal-Pages Jaunes on their selection and deployment of Cloudflare
  • Customer panel with feedback from our customers Back Market, Oscaro and Ankama

During the first part of the morning, David Lallement (Account Executive, Cloudflare) presented Cloudflare, focusing on the latest features and upcoming developments.

Presentation on new Cloudflare products by David Lallement

Étienne Labaume, representing Cloudflare's network team, then gave a brief presentation on the architecture of the Internet and on Netperf, the solution designed by Cloudflare that has drastically reduced 522 errors.

Étienne Labaume presenting the Cloudflare network
Percentage of 522 requests to the origin

Then, Jürgen Coetsiers, co-founder of Marlin Cloud, explained, using GDPR as an example, how his company has helped its customers, including a large multi-brand group, meet their compliance requirements. Drawing on Cloudflare solutions, in particular Access, Jürgen presented a set of automation best practices that allow multi-cloud applications to become, and remain, compliant.

Presentation on the case of their customer AB InBev by Jürgen Coetsiers

During the second part, Loïc Troquet, head of the Technical Excellence unit for www.pagesjaunes.fr, presented their customer journey and explained why they chose Cloudflare as their security and performance provider as part of their move to the cloud. Loïc also walked through statistics available from their dashboard, such as cache performance and firewall analytics, to explain the decisive role this data plays in reaching their goals.

Presentation by Loïc Troquet on the deployment of Cloudflare at Solocal-Pages Jaunes

Finally, a customer panel let us share the experiences of Théotime Lévèque, DevOps lead at Back Market, Sébastien Aperghis-Tramoni, systems engineer at Oscaro, and Samuel Delplace, CIO at Ankama.

From left to right: Théotime, Sébastien and Samuel share their feedback

These presentations were punctuated by questions and informal exchanges between the speakers and the audience, without any form of censorship. We insisted that our customers be able to speak freely about Cloudflare and share their feedback, whether positive or negative.

The result: a community of customers and prospects who feel they can speak up and take ownership of an event organized first and foremost for them. The feedback gathered from the speakers and attendees confirmed the importance of this transparency, which is at the heart of Cloudflare's mission.

A few testimonials:

"For us, the meetup was a chance to step back from the day-to-day and discover features that hadn't caught our interest until now. Meeting the Cloudflare team and existing customers led us to explore certain features for concrete use cases (Cloudflare Access, Image Resizing...)."

Sébastien Aperghis-Tramoni, Oscaro


"I'm delighted to have taken part in this event, which was also a chance to draw inspiration from how Cloudflare is used in other contexts."

Loïc Troquet, Solocal-Pages Jaunes


"This customer day was very enriching and allowed us to share best practices across different use cases for Cloudflare solutions. It opened us up to new opportunities and new challenges to take on."

Pascal Binard, Marlin Cloud


"I took part in the event to stay informed about the latest developments on the platform in a more condensed format than the blog posts, which I don't always have time to follow, and also to meet other customers. It is very important for us to be able to rely on one another and exchange feedback. I also appreciated that my participation in the panel was not censored and that I could share the good as well as the not-so-good feedback."

Théotime Lévèque, Back Market


Cloudflare's France Customer Success team, from left to right: Valentine, Lorène and David

In our survey, attendees said they particularly appreciated the opportunities to talk with customers, prospects and the Cloudflare team, as well as the updates on new products and future developments. Some, however, also said they would have liked more detail on the roadmap. We will take this feedback into account when defining the content of our next event. Don't hesitate to contact us if you have other suggestions to help us improve the next gathering. We hope to see many of you there!

A globe-trotter? Feel free to come see us at the upcoming European events in Amsterdam or Manchester.

Want to join the Cloudflare team? Don't hesitate to apply here!


A few words about our speakers and their companies:

Jürgen Coetsiers, Co-founder, www.marlincloud.com
Marlin Marketing Cloud is a SaaS product designed to make your digital marketing a success. While your agency creates and develops the content of your next digital campaign, Marlin lets you keep control of every aspect, from hosting and security to monitoring and legal considerations. This helps you stay compliant with all regulations, whether internal or governmental (GDPR, …). Through our various services, we put our extensive experience, combined with industry best practices, at your disposal, whether via training, an agency certification program, or our 24/7 online help service.

Loïc Troquet, Head of the Technical Excellence unit for www.pagesjaunes.fr, Solocal
Solocal Group's Internet activities are built around two product lines: Local Search and Digital Marketing. With Local Search, the Group offers digital services and solutions to businesses to increase their visibility and grow their contacts. Building on its expertise, SoLocal Group today has nearly 490,000 customers and more than 2.4 billion visits through its 3 flagship brands (PagesJaunes, Mappy and Ooreka), as well as through its partnerships.

Théotime Lévèque, DevOps lead at Back Market
Founded in November 2014, Back Market is the first marketplace that gives consumers access to thousands of tech products refurbished by certified professionals. These electrical and electronic products are guaranteed for 6 to 24 months, at unbeatable prices. Currently open in 6 countries, our mission is to make refurbished tech products mainstream all over the world.

Sébastien Aperghis-Tramoni, systems engineer at Oscaro
Oscaro is the European leader in online sales of original car parts. The Group never stops improving the tool that made its name: its unique spare-parts catalogue, the largest and most complete on the market, with nearly 1 million references. The challenge: the right part for the right vehicle for the right person.

Samuel Delplace, CIO at Ankama
Founded in 2001, Ankama is today an independent digital-creation group specializing in entertainment and a major player in the video-game world! Since the phenomenal success in 2004 of the online game DOFUS (85 million accounts created worldwide, including more than 40 million in France), Ankama has branched out into several fields of activity to become a true transmedia group.

Wednesday, 06 November

07:00

What’s new with Workers KV? [The Cloudflare Blog]

The Storage team here at Cloudflare shipped Workers KV, our global, low-latency, key-value store, earlier this year. As people have started using it, we’ve gotten some feature requests, and have shipped some new features in response! In this post, we’ll talk about some of these use cases and how these new features enable them.

New KV APIs

We’ve shipped some new APIs, both via api.cloudflare.com, as well as inside of a Worker. The first one provides the ability to upload and delete more than one key/value pair at once. Given that Workers KV is great for read-heavy, write-light workloads, a common pattern when getting started with KV is to write a bunch of data via the API, and then read that data from within a Worker. You can now do these bulk uploads without needing a separate API call for every key/value pair. This feature is available via api.cloudflare.com, but is not yet available from within a Worker.

For example, say we’re using KV to redirect legacy URLs to their new homes. We have a list of URLs to redirect, and where they should redirect to. We can turn this list into JSON that looks like this:

[
  {
    "key": "/old/post/1",
    "value": "/new-post-slug-1"
  },
  {
    "key": "/old/post/2",
    "value": "/new-post-slug-2"
  }
]

And then POST this JSON to the new bulk endpoint, /storage/kv/namespaces/:namespace_id/bulk. This will add both key/value pairs to our namespace.

Likewise, if we wanted to drop support for these redirects, we could issue a DELETE that has this body:

[
    "/old/post/1",
    "/old/post/2"
]

to /storage/kv/namespaces/:namespace_id/bulk, and we’d delete both key/value pairs in a single call to the API.
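
For anyone scripting this outside a Worker (say, from a Node 18+ script where fetch is available), here is a rough sketch of how that call could look. The bulkWrite helper, the account-scoped path under api.cloudflare.com/client/v4, and the Bearer-token header are assumptions based on common Cloudflare API conventions rather than details from this post, so check the API documentation before relying on them:

// Hypothetical helper: push an array of { key, value } objects to the
// bulk endpoint described above. ACCOUNT_ID, NAMESPACE_ID and API_TOKEN
// are placeholders for your own values.
const BASE = "https://api.cloudflare.com/client/v4"

async function bulkWrite(accountId, namespaceId, apiToken, pairs) {
  const res = await fetch(
    `${BASE}/accounts/${accountId}/storage/kv/namespaces/${namespaceId}/bulk`,
    {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(pairs),
    }
  )
  return res.json()
}

// bulkWrite(ACCOUNT_ID, NAMESPACE_ID, API_TOKEN, [
//   { key: "/old/post/1", value: "/new-post-slug-1" },
//   { key: "/old/post/2", value: "/new-post-slug-2" },
// ])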

The bulk upload API has one more trick up its sleeve: not all data is a string. For example, you may have an image as a value, which is just a bag of bytes. If you need to write some binary data, you'll have to base64-encode the value's contents so that it's valid JSON. You'll also need to set one more key:

[
  {
    "key": "profile-picture",
    "value": "aGVsbG8gd29ybGQ=",
    "base64": true
  }
]

Workers KV will decode the value from base64, and then store the resulting bytes.
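
As a minimal sketch of that preparation step, assuming a Node script and a hypothetical binaryEntry helper (any base64 encoder would do just as well), you could build such an entry like this:

// Read a binary file and wrap it in a bulk-upload entry,
// base64-encoding the bytes so the JSON stays valid.
const { readFile } = require("fs/promises")

async function binaryEntry(key, path) {
  const bytes = await readFile(path) // a Buffer of raw bytes
  return {
    key,
    value: bytes.toString("base64"),
    base64: true,
  }
}

// const entry = await binaryEntry("profile-picture", "./avatar.png")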

Beyond bulk upload and delete, we’ve also given you the ability to list all of the keys you’ve stored in any of your namespaces, from both the API and within a Worker. For example, if you wrote a blog powered by Workers + Workers KV, you might have each blog post stored as a key/value pair in a namespace called “contents”. Most blogs have some sort of “index” page that lists all of the posts that you can read. To create this page, we need to get a listing of all of the keys, since each key corresponds to a given post. We could do this from within a Worker by calling list() on our namespace binding:

const value = await contents.list()

But what we get back isn’t only a list of keys. The object looks like this:

{
  keys: [
    { name: "Title 1” },
    { name: "Title 2” }
  ],
  list_complete: false,
  cursor: "6Ck1la0VxJ0djhidm1MdX2FyD"
}

We’ll talk about this “cursor” stuff in a second, but if we wanted to get the list of titles, we’d have to iterate over the keys property, and pull out the names:

const keyNames = value.keys.map(e => e.name)

keyNames would be an array of strings:

["Title 1", "Title 2", "Title 3", "Title 4", "Title 5"]

We could take keyNames and those titles to build our page.

So what’s up with the list_complete and cursor properties? Well, imagine that we’ve been a very prolific blogger, and we’ve now written thousands of posts. The list API is paginated, meaning that it will only return the first thousand keys. To see if there are more pages available, you can check the list_complete property. If it is false, you can use the cursor to fetch another page of results. The value of cursor is an opaque token that you pass to another call to list:

const value = await NAMESPACE.list()
const cursor = value.cursor
const next_value = await NAMESPACE.list({"cursor": cursor})

This will give us another page of results, and we can repeat this process until list_complete is true.
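
Putting it together, a small sketch of that loop inside a Worker might look like the following; the listAllKeys helper name is made up here, and NAMESPACE stands for whatever binding your Worker uses:

// Collect every key name in a namespace, one page at a time,
// following the cursor until list_complete is true.
async function listAllKeys(namespace) {
  let names = []
  let cursor
  while (true) {
    const page = cursor
      ? await namespace.list({ cursor })
      : await namespace.list()
    names = names.concat(page.keys.map(k => k.name))
    if (page.list_complete) {
      break
    }
    cursor = page.cursor
  }
  return names
}

// const titles = await listAllKeys(NAMESPACE)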

Listing keys has one more trick up its sleeve: you can also return only keys that have a certain prefix. Imagine we want to have a list of posts, but only the posts that were made in October of 2019. While Workers KV is only a key/value store, we can use the prefix functionality to do interesting things by filtering the list. In our original implementation, we had stored only the post titles as keys:

  • Title 1
  • Title 2

We could change this to include the date in YYYY-MM-DD format, with a colon separating the two:

  • 2019-09-01:Title 1
  • 2019-10-15:Title 2

We can now ask for a list of all posts made in 2019:

const value = await NAMESPACE.list({"prefix": "2019"})

Or a list of all posts made in October of 2019:

const value = await NAMESPACE.list({"prefix": "2019-10"})

These calls will only return keys with the given prefix, which, in our case, corresponds to a date. This technique can let you group keys together in interesting ways. We’re looking forward to seeing what you all do with this new functionality!
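
For instance, here is a sketch of how an "October 2019" index could be built from such keys. The postsForMonth helper is hypothetical, it reuses the date:title key scheme from the example above, and it ignores pagination for brevity:

// List only the keys for a given month and strip the date prefix to
// recover the post titles. Single page only; combine with the pagination
// loop above if more than a thousand keys match.
async function postsForMonth(namespace, yearMonth) {
  const page = await namespace.list({ prefix: yearMonth })
  // Each key looks like "2019-10-15:Title 2"; keep only the title part.
  return page.keys.map(k => k.name.split(":").slice(1).join(":"))
}

// const octoberTitles = await postsForMonth(NAMESPACE, "2019-10")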

Relaxing limits

For various reasons, there are a few hard limits with what you can do with Workers KV. We’ve decided to raise some of these limits, which expands what you can do.

The first is the limit of the number of namespaces any account could have. This was previously set at 20, but some of you have made a lot of namespaces! We’ve decided to relax this limit to 100 instead. This means you can create five times the number of namespaces you previously could.

Additionally, we had a two megabyte maximum size for values. We’ve increased the limit for values to ten megabytes. With the release of Workers Sites, folks are keeping things like images inside of Workers KV, and two megabytes felt a bit cramped. While Workers KV is not a great fit for truly large values, ten megabytes gives you the ability to store larger images easily. As an example, a 4k monitor has a native resolution of 4096 x 2160 pixels. If we had an image at this resolution as a lossless PNG, for example, it would be just over five megabytes in size.

KV browser

Finally, you may have noticed that there’s now a KV browser in the dashboard! Needing to type out a cURL command just to see what’s in your namespace was a real pain, and so we’ve given you the ability to check out the contents of your namespaces right on the web. When you look at a namespace, you’ll also see a table of keys and values:

The browser has grown with a bunch of useful features since it initially shipped. You can not only see your keys and values, but also add new ones:

edit existing ones:

...and even upload files!

You can also download them:

As we ship new features in Workers KV, we’ll be expanding the browser to include them too.

Wrangler integration

The Workers Developer Experience team has also been shipping some features related to Workers KV. Specifically, you can fully interact with your namespaces and the key/value pairs inside of them.

For example, my personal website is running on Workers Sites. I have a Wrangler project named “website” to manage it. If I wanted to add another namespace, I could do this:

$ wrangler kv:namespace create new_namespace
Creating namespace with title "website-new_namespace"
Success: WorkersKvNamespace {
    id: "<id>",
    title: "website-new_namespace",
}

Add the following to your wrangler.toml:

kv-namespaces = [
    { binding = "new_namespace", id = "<id>" }
]


I’ve redacted the namespace IDs here, but Wrangler let me know that the creation was successful, and provided me with the configuration I need to put in my wrangler.toml. Once I’ve done that, I can add new key/value pairs:

$ wrangler kv:key put "hello" "world" --binding new_namespace
Success

And read it back out again:

$ wrangler kv:key get "hello" --binding new_namespace
world

If you’d like to learn more about the design of these features, “How we design features for Wrangler, the Cloudflare Workers CLI” discusses them in depth.

More to come

The Storage team is working hard at improving Workers KV, and we’ll keep shipping new stuff every so often. Our updates will be more regular in the future. If there’s something you’d particularly like to see, please reach out!