POST, PUT, and CRUD

Anyone who ever worked with object storage knows that PUT creates, GET reads, POST updates, and DELETE deletes. Naturally, right? POST is such a strange verb with oddball encodings that it's perfect to update, while GET and PUT are matching twins like read(2) and write(2). Imagine my surprise, then, when I found that the official definition of RESTful makes POST create objects and PUT update them. There is even a FAQ, which uses sophistry and appeals to the authority of RFCs in order to justify this.

So, in the world of RESTful solipsism, you would upload an object foo into a bucket buk by issuing "POST /buk?obj=foo" [1], while "PUT /buk/foo" applies to pre-existing resources. They did have to admit, though, that RFC 2616 assumes that PUT creates.

All this goes to show, too much dogma is not good for you.

[1] It's worse, actually. They want you to do "POST /buk", and receive a resource ID, generated by the server, and use that ID to refer to the resource.
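
To make the difference concrete, here is a minimal sketch of the two conventions against a toy in-memory bucket. The Bucket class and its method names are made up for illustration and are not any real object-store API. The status codes follow RFC 2616, which returns 201 when a PUT creates a resource.

```python
import uuid


class Bucket:
    """Toy in-memory bucket (hypothetical, for illustration only)."""

    def __init__(self):
        self.objects = {}

    def put(self, name, data):
        # "PUT /buk/<name>": the client picks the name.
        # Creates the object if it is missing, overwrites it otherwise.
        created = name not in self.objects
        self.objects[name] = data
        return 201 if created else 200

    def post(self, data):
        # "POST /buk": the server picks the ID and hands it back.
        obj_id = uuid.uuid4().hex
        self.objects[obj_id] = data
        return 201, obj_id


buk = Bucket()
print(buk.put("foo", b"first"))   # object storage style: PUT creates
print(buk.put("foo", b"second"))  # same name again: PUT overwrites
print(buk.post(b"third"))         # the dogmatic style: server-generated ID
```

With PUT the client already knows the name of what it stored. With POST it has to remember whatever the server returned.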

Comment to 'О цифровой экономике и глобальных проблемах человечества' ('On the digital economy and the global problems of humanity') by omega_hyperon

I hear all this every day. These people take the well-established trend of slowing scientific and technological progress and say: we know better what humanity needs. Take the money away from the undeserving and give it to smart people like me, and progress will resume. And we'll protect nature while we're at it! And capitalism is always to blame.

There is no arguing that civilization is treading water. But here are a couple of things this jerk keeps quiet about.

First, if you take the money away from Disney and give it to a research institute, there will be no money. He seems to speak Russian, yet he failed to learn such a simple lesson from the collapse of the USSR. American science crushed Soviet science during the scientific and technological revolution above all because the capitalist economy provided the economic base for that science, while the socialist economy was a failure.

In general, if you compare the budgets of Apple and Disney with the budget of Housing and Urban Development and similar agencies, the difference is 2 orders of magnitude. If someone wants to give science more money, the thing to do is not to rob Disney, but to stop giving freeloaders free housing. The slowdowns of science and of capitalism go hand in hand and are caused by government policy, not by some bitcoin.

Second, who even believes these charlatans? For decades we were fed stories about children in Africa, yet during the terrible famine in Ethiopia its population grew from 38 million to 75 million. The same thing happened with polar bears. The area of forests on the planet is growing. Suppose some forests in Brazil were cut down for pasture... But who is going to believe that?

This crisis of expertise is no joke. The fear of vaccines was created not by capitalism and bitcoin, but by the rot and decay of the system of scientific research as a whole. He did not name the institute whose budget he compared with Disney's, and I wonder how many slackers there are among its staff.

The collapse of science shows not only in how the public has lost faith in scientists. Objective indicators have sagged too. The reproducibility of publications is very poor and getting worse. Is bitcoin to blame for that too?

All in all, most of this whining strikes me as extremely harmful. If he is unable to diagnose the causes of the crisis, the proposed solutions will give us nothing, and biodiversity will not increase.

Swift is 2 to 4 times faster than any competitor

Or so they say, at least for certain workloads.

In January of 2015 I led a project to evaluate and select a next-generation storage platform that would serve as the central storage (sometimes referred to as an active archive or tier 2) for all workflows. We identified the following features as being key to the success of the platform:

  • Utilization of erasure coding for data/failure protection (no RAID!)
  • Open hardware and the ability to mix and match hardware (a.k.a. support heterogeneous environments)
  • Open source core (preferred, but not required)
  • Self-healing in response to failures (no manual processes required, like replacing a drive)
  • Expandable online to exabyte-scale (no downtime for expansions or upgrades)
  • High availability / fault tolerance (no single point of failure)
  • Enterprise-grade support (24/7/365)
  • Visibility (dashboards to depict load, errors, etc.)
  • RESTful API access (S3/Swift)
  • SMB/NFS access to the same data (preferred, but not required)

In hindsight, I wish we would have included two additional requirements:

  • Transparently tier and migrate data to and from public cloud storage
  • Span multiple geographic regions while maintaining a single global namespace

We spent the next ~1.5 years evaluating the following systems:

  • SwiftStack
  • Ceph (InkTank/RedHat/Canonical)
  • Scality
  • Cloudian
  • Caringo
  • Dell/EMC ECS
  • Cleversafe / IBM COS
  • HGST/WD ActiveScale
  • NetApp StorageGRID
  • Nexenta
  • Qumulo
  • Quantum Lattus
  • Quobyte
  • Hedvig
  • QFS (Quantcast File System)
  • AWS S3
  • Sohonet FileStore

SwiftStack was the only solution that literally checked every box on our list of desired features, but that’s not the only reason we selected it over the competition.

The top three reasons behind our selection of SwiftStack were as follows:

  • Speed – SwiftStack was—by far—the highest-performing object storage platform — capable of line speed and 2-4x faster than competitors. The ability to move assets between our “tier 1 NAS” and “tier 2 object” with extremely high throughput was paramount to the success of the architecture.
  • [...]

Note: While SwiftStack 1space was not a part of the SwiftStack platform at the time of our evaluation and purchase, it would have been an additional deciding factor in favor of SwiftStack if it had been.

Interesting. It should be noted that the performance of Swift is a great match for some workloads, but not for others. In particular, Swift is weak on small-file workloads, such as Gnocchi, which writes a ton of 16-byte objects again and again. The overhead is a killer there, and not just on the wire: Swift has to update its accounting databases each and every time a write is done, so that "swift stat" shows things like quotas. Swift is also not particularly good at HPC-style workloads, which benefit from high bisection bandwidth, because we transfer all user data through so-called "proxy" servers. Unlike e.g. Ceph, Swift keeps the cluster topology hidden from the client, while a Ceph client actually tracks the ring changes, placement groups and their leaders, etc. But as we can see, once the object sizes start climbing and the number of clients increases, Swift rapidly approaches the wire speed.
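
The shape of that curve falls out of a very simple model: every request pays a fixed cost, then the payload moves at wire speed. The sketch below is only that model, not a measurement of Swift. The 5 ms overhead and the 10 Gbit/s link are hypothetical numbers picked for illustration, and the function name is made up.

```python
def effective_throughput(object_size, per_request_overhead_s, wire_bytes_per_s):
    """Bytes per second delivered by one stream of back-to-back requests.

    Each request costs a fixed overhead plus the time to move the payload.
    """
    transfer_time = object_size / wire_bytes_per_s
    return object_size / (per_request_overhead_s + transfer_time)


WIRE = 1.25e9      # hypothetical 10 Gbit/s link, in bytes per second
OVERHEAD = 0.005   # hypothetical 5 ms of fixed per-request cost

for size in (16, 64 * 1024, 1024 ** 3):
    rate = effective_throughput(size, OVERHEAD, WIRE)
    print(f"{size:>12} bytes: {rate / WIRE:.6%} of wire speed")
```

For 16-byte objects the fixed cost is essentially the whole story. For gigabyte-sized objects it disappears into the noise.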

I cannot help noticing that the architecture in question has a front-facing cache pool (tier 1), which is what the ultimate clients see instead of Swift. Most of the time, Swift is selected for its ability to serve tens of thousands of clients simultaneously, but not in this case. Apparently, the end-user invented ProxyFS independently.

There's no mention of Red Hat selling Swift in the post. Either it was not part of the evaluation at all, or the author forgot about it with the passing of time. He did list a bunch of rather weird and obscure storage solutions, though.

PostgreSQL and upgrades

As mentioned previously, I run a personal Fediverse instance with Pleroma, which uses Postgres. On Fedora, of course. So, a week ago, I went to do the usual "dnf distro-sync --releasever=30". And then, Postgres fails to start, because the database uses the previous format, 10, and the packages in F30 require format 11. Apparently, I was supposed to dump the database with pg_dumpall, upgrade, then restore. But now that I have binaries that refuse to read the old format, dumping is impossible. Wow.
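
In hindsight, the mismatch is easy to detect before pulling the trigger, because a Postgres data directory records its format in a file called PG_VERSION. A minimal sketch of such a check follows. The function names are made up, and the /var/lib/pgsql/data path in the usage note is my assumption about where Fedora keeps the cluster.

```python
from pathlib import Path


def data_dir_format(data_dir):
    """Return the major version recorded in the cluster's PG_VERSION file."""
    return (Path(data_dir) / "PG_VERSION").read_text().strip()


def needs_migration(data_dir, new_major):
    """True if the on-disk format differs from what the new binaries expect."""
    return data_dir_format(data_dir) != str(new_major)
```

Running something like needs_migration("/var/lib/pgsql/data", 11) before the distro-sync would have told me to dump the database while the old binaries were still around.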

A little web searching found an upgrader that works across formats (dnf install postgresql-upgrade; postgresql-setup --upgrade). But that one also copies the database, like a dump-restore procedure would. What if the database is too large for this? Am I the only one who finds these practices unacceptable?

Postgres was supposed to be a solid big brother to a boisterous but unreliable upstart MySQL, kind of like Postfix and Exim. But this is just such an absurd fault, it makes me think that I'm missing something essential.

UPDATE: Kaz commented that a form of -compat is conventional:

When I've upgraded in the past, Ubuntu has always just installed the new version of postgres alongside the old one, to allow me to manually export and reimport at my leisure, then remove the old version afterward. Because both are installed, you could pipe the output of one dumpall to the psql command on the other database and the size doesn't matter. The apps continue to point at their old version until I redirect them.

Yeah, as far as I can tell, Fedora does not package anything like that.

Pi-hole

With the recent move by Google to disable the ad-blockers in Chrome (except for Enterprise level customers[1]), interest is sure to increase in methods of protection against ad-delivered malware other than browser plug-ins. I'm sure Barracuda will make some coin if it's still around. And on the free software side, someone is making an all-in-one package for Raspberry Pi, called "Pi-hole". It works by screwing with DNS, which is actually an impressive demonstration of what an attack on DNS can do.

An obvious problem with Pi-hole is what happens to laptops when they are outside of the home site protection. I suppose one could devise a clone of Pi-hole that plugs into dnsmasq. Every Fedora system runs one, because NM needs it in order to support the correct lookup on VPNs {Update: see below}. The most valuable part of Pi-hole is the blocklist; the rest is just scripting.
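
To show how little scripting that is, here is a sketch that turns a hosts-format blocklist into dnsmasq "address=" directives, which make dnsmasq answer with a sink address for the listed domain and everything under it. I am assuming the blocklist comes in hosts format. I did not check what Pi-hole itself ships, and the function name is made up.

```python
def hosts_to_dnsmasq(hosts_text, sink="0.0.0.0"):
    """Convert hosts-format blocklist text into dnsmasq address= lines."""
    lines = []
    for raw in hosts_text.splitlines():
        # Drop trailing comments, then split into "IP name [name...]".
        entry = raw.split("#", 1)[0].split()
        if len(entry) < 2:
            continue  # blank line, comment, or malformed entry
        for domain in entry[1:]:
            if domain in ("localhost", "localhost.localdomain"):
                continue  # never sinkhole the loopback names
            lines.append(f"address=/{domain}/{sink}")
    return lines


sample = """# example blocklist
0.0.0.0 ads.example.com
127.0.0.1 localhost
0.0.0.0 tracker.example.net  # inline comment
"""
print("\n".join(hosts_to_dnsmasq(sample)))
```

On a system where NetworkManager runs dnsmasq, the output would presumably go into a file under /etc/NetworkManager/dnsmasq.d/, but I have not tried it.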

[1] "Google’s Enterprise ad-blocking exception doesn’t seem to include G Suite’s low and mid-tier subscribers. G Suite Basic is $6 per user per month and G Suite Business is $12 per user month."

UPDATE: Ouch. A link by Roy Schestowitz made me remember how it actually worked, and I was wrong above: NM does not run dnsmasq by default. It only has a capability to do so, if you want DNS lookup on VPNs to work correctly. So, every user of VPN enables "dns=dnsmasq" in NM. But it is not the default.

UPDATE: A reader mentions that he was rooted by ads served by Space.com. Only 1 degree of separation (beyond Windows in my family).

Google Fi

Seen an amusing blog post today on the topic of the hideous debacle that is Google Fi (on top of being a virtual network). Here's the best part though:

About a year ago I tried to get my parents to switch from AT&T to Google Fi. I even made a spreadsheet for my dad (who likes those sorts of things) about how much money he could save. He wasn’t interested. His one point was that at anytime he can go in and get help from an AT&T rep. I kept asking “Who cares? Why would you ever need that?”. Now I know. He was paying almost $60 a month premium for the opportunity to able to talk to a real person, face-to-face! I would gladly pay that now.

Respect your elders!

YAML

Seen in a blog entry by Martin Tournoij (via):

I’ve been happily programming Python for over a decade, so I’m used to significant whitespace, but sometimes I’m still struggling with YAML. In Python the drawbacks and loss of clarity are contained by not having functions that are several pages long, but data or configuration files have no such natural limits to their length.

[...]

YAML may seem ‘simple’ and ‘obvious’ when glancing at a basic example, but turns out it’s not. The YAML spec is 23,449 words; for comparison, TOML is 3,339 words, JSON is 1,969 words, and XML is 20,603 words.

There's more where the above came from. In particular, the portability issues are rather surprising.

Unfortunately for me, OpenStack TripleO is based on YAML.

Fraud in the material world

Wow, they better not be building Boeings from this crap:

NASA Launch Services Program (LSP) investigators have determined the technical root cause for the Taurus XL launch failures of NASA’s Orbiting Carbon Observatory (OCO) and Glory missions in 2009 and 2011, respectively: faulty materials provided by aluminum manufacturer, Sapa Profiles (SPI). LSP’s technical investigation led to the involvement of NASA’s Office of the Inspector General and the U.S. Department of Justice (DOJ). DOJ’s efforts, recently made public, resulted in the resolution of criminal charges and alleged civil claims against SPI, and its agreement to pay $46 million to the U.S. government and other commercial customers. This relates to a 19-year scheme that included falsifying thousands of certifications for aluminum extrusions to hundreds of customers.

BTW, those costly failures probably hastened the sale of Orbital to ATK in 2015. There were repercussions for the personnel running the Taurus program as well.

Swift(stack) bragging today

On their official blog:

Over the last several months, SwiftStack has been busy helping two large autonomous vehicle customers. These data pipelines are distributed across edge (vehicle sensors) to core (data center) to cloud/multi-cloud locations, and are challenged with ingest, labeling, training, inferencing, and retaining data at scale. [...] one deployment is handling more than a petabyte of data per week, with four thousand GPU cores from NVIDIA DGX-1 servers fed with 100 GB/s of throughput from SwiftStack cluster.

I suspect the task-queue expirer could be helpful for this. Although, if you're uploading 1 PB per week, it takes about a year to fill a cluster as big as Turkcell's.
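
The back-of-the-envelope arithmetic goes like this. I am assuming decimal petabytes. The 52 PB capacity is simply what "about a year" at 1 PB per week implies, not a published figure for any particular cluster.

```python
PB = 10 ** 15                 # decimal petabyte, in bytes
WEEK_S = 7 * 24 * 3600        # seconds in a week


def sustained_rate(bytes_per_week):
    """Average ingest rate in bytes per second."""
    return bytes_per_week / WEEK_S


def weeks_to_fill(capacity_bytes, bytes_per_week):
    """How long a cluster of the given capacity lasts at that ingest rate."""
    return capacity_bytes / bytes_per_week


print(f"{sustained_rate(PB) / 1e9:.2f} GB/s sustained")
print(f"{weeks_to_fill(52 * PB, PB):.0f} weeks to fill 52 PB")
```

So a petabyte per week is a steady stream of well over a gigabyte per second around the clock, before any replication or erasure coding overhead.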

Apparently the actual storage is provided by Cisco UCS S3260. Some of our customers use Cisco UCS to run Swift too. I always thought of Cisco as a networking company, but it's different nowadays.

Suddenly RISC-V

I knew about that thing because Rich Jones was a fan. Man, that guy is always ahead of the curve.

Coincidentally, a couple of days ago Amazon announced support for RISC-V in FreeRTOS. (I have no idea how free that thing is. It's under the MIT license, but with Amazon, it might be patented up to the gills.)