Log in

Pete Zaitcev's Journal [entries|friends|calendar]
Pete Zaitcev

[ userinfo | livejournal userinfo ]
[ calendar | livejournal calendar ]

HP Reconfigurable [15 Dec 2015|11:36am]

I learned by way of Mirantis today that an entity known as "HP Enterprise" or "HPE" introduced something described thus:

It’s an architecture in which a large server acts as a “pool” of compute, storage, and networking resources, the same way a cloud might. When an application needs resources, they’re allocated from that hardware pool, and when the application goes away, they’re returned from the pool. All of this happens via the composable architecture.

That may explain the mysterious Intel computer that I saw in Tokyo. So it's not quite NUMA taken to extremes, it's also hardware domains taken to extremes.

[link] post comment

Cool hardware in Tokyo [04 Nov 2015|07:46pm]

At the Mitaka Summit, we finally got some interesting kit exhibited, after the relatively lean summits in Atlanta and Vancouver. Unfortunately, the lightning in the Marketplace was very weird and pictures came out poorly.

My personal favourite is probably the flash array by SanDisk. It's nothing but JBOF, the host connection is SAS. You'd think any idiot could slap a few flash chips on cards and plug them into backplane... But just look how elegant it is. The capacity of the 2U box is 512 TB, but the whole thing only consumes 700 W maximum. It's brilliant, really.

Unfortunately, I don't have a good picture, but the second best was Ericksson's passive optical backplane. It promises to make your cables last forever: just swap out optronics when new bit rates come along. Even a terabit! Now it may actually be a misguided product. If they cannot get 3rd party vendors to build modules for it, the whole things comes crashing to the ground. Ditto if they build, but overprice. But the audacity of making something that's different is to be acknowledged. And frankly I'm not a fan of re-cabling when new servers come about.

Intel wins a consolation prize for preservance. They quietly presented some kind of next-generation multiblock computer, with pieces connected by serial cables. Finally, the future dreamed by the creators of Infiniband is here - only 15 years late, and still we don't know if it is viable.

There was also a bunch of fairly mundane boxes. Various also-run flash vendors were present, of course. Interestingly, SolidFire had a booth, but without anything eye-catching. Resting on the laurels? IBM brought their newest PowerPC, which was mostly remarkable for still existing. That sort of thing.

[link] post comment

Darcy on the future of storage [27 Oct 2015|07:41pm]

Quick comment on the following:

Good morning, madam. What kind of storage system would you like me to build for you today?

Scary thought. That means that selling storage products is going to be hard for all of us. We'll be selling components, both hardware and software, or we'll be selling integration and support services. Somebody will always pay to have somebody else assemble the parts, maybe add some light customization, and support the result. There's a nice living to be made there... but no empires.

Why is it a problem that no empires are to be built? It's only a problem for an empire-builder like I dunno... Sam Altman or something. Darcy is an old engineer, not a startup founder. A good one, too. His kids aren't going to go to bed hungry.

We've been at this dance before with Linux. People have been asking if Red Hat was going to be like Microsoft, and I told everyone: nope. We're transfering the wealth that the proprietary lock-in vendors were collecting back to the users. That was the whole idea. In the process, we're collecting less - a more reasonable amount, necessary to put stuff together and make it run. Therefore, we're not going to be as wealthy off users' backs. But the society as a whole benefits.

So cry me a river. Not scary at all. But RTWT, I think he's drawing a truthful outline overall.

P.S. Another thing, what's magical about storage? Why, I can go build spacecraft when storage goes bust. Or whatever. Of course it's a pity for all the storage-specific techniques and skills that I accumulated, but eh. As long as we leave behind the good code (and docs), it's all good.

[link] 1 comment|post comment

Pics Up [05 Oct 2015|01:31pm]

Чёт я под настроение выложил картинки с этой недели на форумы Авиабазы. Anglophones are welcome to pictures at least.

[link] 1 comment|post comment

TLS Security In Firefox 40 [11 Sep 2015|12:33pm]

What do people at Mozilla think is going to happen when I need to access a website and Firefox says that TLS parameters are insecure and thus I cannot? I'm going to use Chrome, that's what. Or maybe even a hacked Midori, where I can adjust build-time parameters of gcr.

That company went way downhill when they kicked Eich out.

[link] 6 comments|post comment

Tablet Uber Alles Or Is It [14 Aug 2015|03:38pm]

Given the trouble with modern laptops, I'm seriously thinking if I should make a jump to a gigantic tablet with a keyboard. You run "make" on VM. Not enough RAM? Order in the cloud! The idea was planted in my mind by that jerk Atwood, who penned an article claiming a death of PC. And a month ago I saw someone at Python meetup using Canopy. It kinda worked, actually. I expect Github Atom to be even better.

Unfortunately, there are problems in 3 broad categories still.

First, the hotspot Internet connectivity sucks. It is plain unreliable. VPN, ssh, and IRC are often blocked; it's necessary to remember "Connectivity Through Anything" lessons and tehcniques. When it works, it's often slow. These problems extend to venues such as Intel's Executive Briefing Center. If "executives" eating their awesome snacks cannot obtain a decent WiFi, what hope do I have? I do not have cellphone data, but I hear bitching about it.

Second, the usual questions about privacy and security apply. Non-proprietary tablets suck immensely, from what I heard.

Third, tablets top out at 10..11 inch. Sorry, but that is not enough to kill laptops while laptops continue to be made. Certainly, Atwood made an argument that as tablets absorb users, PC makers will stop. The day the last one quits, we'll have to use the least shitty tablet regardless of size. But today is not that day.

UPDATE: 3 weeks after this post, Apple unveiled a 12.9" (2732 x 2048) iPad Pro, with a keyboard as a factory option.

[link] post comment

User-facing hardware [14 Aug 2015|03:18pm]

New business trip, new hardware pictures.

It was almost a year, and I'm still looking for a decent laptop, same criteria. I saw a couple of guys using Lenovo X1 Carbon, which looks good. Most importantly, the left Ctrl is now extends to its proper position. Almost a winner, but unfortunately, there are issues. Apparently, the screen on the X1 is not touching the main frame flat when it's closed, so a bundle of clothing pressing in the middle between the hinges is capable to making a nasty crack in plastic. Not acceptable for what is a $1,400 laptop even with Amazon's "discount" of $900. Way to go, Lenovo. Almost had me this time.

Meanwhile, a $500 Dell Vostro continues to soldier on. It's showing its age: building Ceph with "make -j${N}" requires more RAM that it has for any reasonable N, and dialog windows started to outgrow its screen (notably, some of GNOME preferences). I still need a laptop, but can't find a suitable one. The Lenovo X1 tops out at 8GB, which was another strike against it.

I was a little sad when Google stopped making Nexus 7. I have a 2013 version and it is quite good. In the same meeting, I bumped into a guy with a projected update to Nexus 7 that became orphaned when Google pulled the plug. ASUS continued to build them and market them as "MemoPad 7". However, taking the page from Microsoft playbook with their "Surface" and "Surface Pro", ASUS sell "MemoPad 7" versions ranging from worthless piece of junk with 1024x600 to actual Nexus 7 replacements with 1920x1200. Allegedly, the battery life and speed are much improved by using Intel's embedded Atom core. Some of the ARM-optimized apps may not work (example is some kind of music editing thing for podcasters).

[link] post comment

git submodule [10 Aug 2015|09:08pm]

It's a familiar sign to anyone dealing with a project that includes submodules: you run "make" and see something like this:

rgw/rgw_main.cc: In member function ‘virtual int RGWMongooseFrontend::run()’:
rgw/rgw_main.cc:993:8: error: ‘struct mg_callbacks’ has no member named ‘log_access’
cb.log_access = rgw_civetweb_log_access_callback;

Ah, yes. Submodule civetweb is obviously out of date. Type "git submodule init; git submodule update" and... nothing happens. The goddamn submodules are stuck.

At this point, running "git diff origin" produces an output like:

--- a/ceph-object-corpus
+++ b/ceph-object-corpus
@@ -1 +1 @@
-Subproject commit 20351c6bae6dd4802936a5a9fd76e41b8ce2bad0
+Subproject commit bb3cee6b85b93210af5fb2c65a33f3000e341a11

So yeah, obviously you fetched the right thing from the origin, but you cannot merge or rebase no matter what. You may spend a good part of a hackathon reading man pages for git subcommands, all for naught.

Fortunately, the stuck submodules can be worked around, by looking at the "git diff origin" above, then doing this:

git update-index --replace --cacheinfo 160000,20351c6bae6dd4802936a5a9fd76e41b8ce2bad0,ceph-object-corpus

You get the idea: force the right commit from the origin into the local index. This allows "git submodule update" to clone and checkout the right thing and you're off to the races. The fixups in the index will stick out in "git status", so create an empty commit to get rid of them (but only after "git submodule update").

When you're done, you might want to kick in the nuts whoever chose to use submodules in your project.

P.S. "git --version" yields "git version 2.4.3".

P.P.S. You verify what you have in the index by running "git ls-files -s ceph-object-corpus" (or src/civetweb). The mode must be 160000 and the hash should match the upstream. Note that "git diff origin" continues to display a disparity until you've run the "git submodules update".

[link] post comment

the future is here [10 Aug 2015|08:12pm]


10005 zaitcev   20   0  809920 755384  13220 R  99.7 12.5   0:20.47 cc1plus
 9894 zaitcev   20   0 1946748 1.806g  15800 R  99.3 31.4   1:46.60 cc1plus
 9956 zaitcev   20   0 1652076 1.524g  15832 R  99.0 26.5   1:30.64 cc1plus
   72 root      20   0       0      0      0 S   4.0  0.0   0:04.60 kswapd0
 9957 zaitcev   20   0   56648  43536   1436 S   2.7  0.7   0:00.49 as
 9895 zaitcev   20   0   79480  66368   1480 S   2.0  1.1   0:00.89 as
 2870 zaitcev   20   0 1989524 533104 160868 S   1.3  8.9  60:28.10 firefox
 2035 zaitcev   20   0 2018216 166872  20028 S   0.7  2.8  16:50.66 gnome-sh

That's right, boys and girls, a compiler with a bigger resident size than Firefox. Three times bigger.

[link] 3 comments|post comment

DNF - Debugging Not Finished [02 Aug 2015|05:12pm]

It's 100% like CKS said:

[root@kvm-rei zaitcev]# dnf check-update openstack-swift
Last metadata expiration check performed 0:10:02 ago on Sun Aug  2 18:42:13 2015.

openstack-swift.noarch                   2.3.0-2.fc23                    rawhide
[root@kvm-rei zaitcev]# dnf update openstack-swift
Last metadata expiration check performed 0:10:07 ago on Sun Aug  2 18:42:13 2015.
Dependencies resolved.
Nothing to do.
[root@kvm-rei zaitcev]# rpm -q openstack-swift
[root@kvm-rei zaitcev]# 

Searching for a good way to make it unstuck.

[link] 3 comments|post comment

Conference submission and voting [29 Jul 2015|10:23am]

Generally I feel that I do not do any work that's important enough to present at conferences. My previous presentation was at OLS back in 2005, concerning usbmon. The usbmon is something a guy learning C would program: it's a circular buffer into which kernel drops tracing events; Wireshark pulls them out. Hardly a conference material, but at the time I thought it was supremely important to proseltize the basic techniques of always-on tracing, because it would improve the quality and the ease of debugging of the kernel overall. I really wanted FireWire guys to adopt a similar tracing scheme, because it was a hell on a stick debugging juju with just printk(). Needless to say, that was a miserable failure, as was FireWire itself. I don't think anyone who came to listen to my presentation in Ottawa received their money's worth.

Or did they? Recently an epiphany occured to me. I really should not even think if anyone is interested. That is conference organizers' job, not mine! As a result, I sent a proposal to OpenStack Tokyo, entitled "The Plot to Destroy OpenStack Swift Using C++: Enhancements of Swift API Compatibility in Ceph RADOS Gateway". It's basically a compendum of practical issues that occur when running Swift apps on top of Ceph RGW and what we do to help people do that.

The things are a little different from 10 years ago, because attendees can vote on the submissions. This sounds democratic. I went through all submissions on the storage track and voted them according to my preference. It took a very long time and I suspect that I was crowdsourced by the organizers in the best traditions of Web 2.0. I wonder if they'll even read the abstracts. :-)

[link] post comment

Fedora 22 killed IPv6 and I'm fine [20 Jul 2015|10:18am]

I upgraded Fedora on my home router to F22 and immediately IPv6 disappeared on the internal network. The problem is that radvd started throwing its usual "no linklocal address configured on ethmain.5" (although the message is only visible with "IgnoreIfMissing off;"), which leads to "interface ethmain.5 does not exist or is not set up properly". With the default IgnoreIfMissing, radvd continues running but refuses to work, quietly. Needless to say, the interface has a perfectly valid link-local address, same as it had in F21 before the upgrade.

There used to be a time when I took a problem like this as an affront to the idea of IPv6 superiority and the reputation of Fedora as a platform for roll-your-own home router. Now though, I don't give a rat's tail for IPv6. Let Comcast and Google care and pay someone to care. Okay, I lied. I cared enough for file a bug 1244428, but I'm not rushing to build from SRPMs, reinstall old versions, and such.

(Frankly, if we just engage Lennart's attention for an hour, he'll incoporate a perfectly serviceable radvd function into systemd-networkd. Of course, one would need journalctl to see any messages from it, but since it is certain to work, nobody would actually attempt that. The bug report would go unanswered like radvd bugs today, but again, there would not be any bugs.)

UPDATE: The root cause turned out to be an incorrect link-local address after all. I presumed that RFC meant whole fe80::/10 prefix to be used, so each interface had a different address within a node. Ergo, fe80:0:0:1::1/64 address. As it turns out, I may have confused Link-Local address with Site-Local address. The RFC 4291 specifies a fe80::/64 prefix and in F22 the radvd started enforcing it. Note that apparently the lower part has to be unique across the link.

[link] post comment

Rich and comments [11 Jun 2015|11:38am]

Rich Jones posted an article about being banned by Boing-Boing, supposedly for bringing attention to their use of affiliate links (the practice that Gamergate groups criticized as well — and scored a regulatory win against). Meanwhile, all my comments at Rich's blog are blackholed, which is quite ironic. Generally, I am not into this "blog comment" thing. Ani-nouto never had any comments and is doing great that way. But some people like comments, so I leave them as necessary.

[link] post comment

Cool hardware in Vancouver [29 May 2015|12:38pm]

There wasn't much, but more than in Atlanta. The most "pro" looking kit was presented by NEC: basically a bladeserver, but the "blades" are SBCs, each of them accompained by a dedicated drive card. I can see downsides of this design, but very cute.

Unfortunately, they only offer CPU cards based on Atom. No ARM or anything.

The only other interesting booth belonged to StackVelocity, a subsidiary of JB Circuits that does custom design.

I'm sorry to say, their wares looked decidedly pedestrian, which is to be expected: their sales point is low cost, and stuff of that nature underpins the modern datacenter. One curious thing, however, is the variety of flash cards they offer. Basically Fusion-IO on budget. One was particularly tricky by having 2 layers. At first I even thought it could have flash chips mounted sideways, but nope, the science of low-cost computing is not there yet.

P.S. NEC also sell the same chassis with CPU cards instead of drive cards under the index "DX1000".

[link] post comment

Semi-hard numbers from Rackspace [29 May 2015|12:11pm]

Previously in hard numbers: China, Wikimedia, Amazon S3. Rackspace previously reported in creiht's preso 18 months ago. This time, scotty went public at the Vancouver (Liberty) summit with the following:

> 50 billion objects
> 100 PB data (sanitized number, but way higher than 85 PB)
= 6 global clusters
3:1 PUT:GET ratio
10k+ requests/second

The number of objects is roughly 40 times less than in Amazon S3.

[link] post comment

How Mitchell Baker made me to divorce [06 May 2015|02:10pm]

Well, nearly did. Deleting history in Firefox 37 is very slow and the UI locks up while you do that. "Very slow" means an operation that takes 13 minutes (not exaggerating - it's reproducible). The UI lock-up means a non-dismissable context menu floating over everything; Firefox itself being, of course, entirely unresponsive. See the screencap.

The screencap is from Linux where I confirmed the problem, but the story started on Windows, where my wife tried to tidy up a bit. So, when Firefox locked up, she killed it, and repeated the process a few times. And what else would you do? We are not talking about hanging up for seconds - it literally was many minutes. Firefox did not pop a dialog with "Please wait, deleting 108,534 objects with separate SQLite transactions", a progress gauge, and a "Cancel" button. Instead, it pretended to lock up.

Interestingly enough, remember when Firefox had a default to keep the history for a week? This mode is gone now - FF keeps the history potentially forever. Instead, it offers a technical limit: 108,534 entries are saved in the "Places" database at the most, in order to prevent SQLite from eating all your storage. Now I understand why my brown "visited" links never go back to blue anymore.

The problem is, there's no alternative. I tried to use Midori as my main browser for a month or two in early 2014, but it was a horrible crash city. I had no choice but to give up and go back to Firefox and its case of Featuritis Obesum.

[link] post comment

UAT on RTL-SDR update [15 Jan 2015|12:52pm]

About a year ago, when I started playing with ADS-B over 1090ES, I noticed that small airplanes heavily favour UAT in 978 MHz, because it's cheaper. For the purposes of independent on-board traffic, thus, it would be important to tap directly into UAT. If I mingle with airliners and their 1090ES, it's in controlled airspace anyway (yes, I know that most collisions happen in controlled airspace, but I'm grasping at excuses to play with UAT here, okay).

UAT poses a large challenge for RTL-SDR because of its relatively high data rate: 1.041667 mbit/s. RTL 2832U can only sample at 3.2 MS/s, and is only stable at 2.8 MS/s. Theoretically it should be enough, but everything I saw out there only works with 8 samples per bit, for weak signals. So, for initial experiments, I thought to try a trick: self-clocking. I set the sample rate to 2083334, and then do no clock recovery whatsoever. It hits where it hits, and if sample points lay well onto bits, the packet is recovered, otherwise it's lost.

I threw together some code, ran it, and it didn't work. Only received white noise. Oh, well. I moved the repo into a dusty corner of Github and forgot about it.

Fast forward a year, a gentleman by the name Oliver Jowett noticed a big problem: the phase was computed incorrectly. I fixed that up and suddenly saw some bits recovered (as it happened, zeroes and ones were swapped, which Oliver had to correct again).

After that, things started to move forward. Having bits recovered allowed to measure reception, and I found that the antenna that I built for the 978 MHz band was much worse than the stock antenna for TV. Imagine my surprise and disappoinment: all that soldering for nothing. I don't know where I screwed up, but some suggest that the computer and dongle produce RF noise that screws with antenna, and a length of coax helps with that despite the losses incurred by the coax (/u/christ0ph liked that in parti-cular).

This is bad.

This is good. Or better at least.

From now on, it's the error recovery. Unfortunately, I have no clue what this means:

The FEC parity generation shall be based on a systematic RS 256-ary code with 8-bit code word symbols. FEC parity generation for each of the six blocks shall be a RS (92,72) code.

Quoted from Annex 10, Volume III,

P.S. Observing Oliver's key involvement, the cynical may conclude that the whole premise of Open Source is that you write drek that does not work, upload it to Github, and someone fixes it up for you, free of charge. ESR wrote a whole book about it ("with enough eyes all bugs are shallow")!

[link] 2 comments|post comment

Swift and balance [08 Jan 2015|11:17am]

Swift is on the cusp of getting yet another intricate mechanism that regulates how partitions are placed: so-called "overload". But new users of Swift even keep asking what weights are, and now this? I am not entirely sure it's necessary, but here's a simple explanation why we're ending with attempts at complexity (thanks to John Dickinson on IRC).

Suppose you have a system that spreads your replicas (of partitions) across (failure) zones. This works great as long as your zones are about the same size, usually a rack. But then one day you buy a new rack with 8TB drives and suddenly the new zone is several times larger than others. If you do not adjust anything, it ends only filled by quarter at best.

So, fine, we add "weights". Now a zone that has weight 100.0 gets 2 times more replicas than zone with weight 50.0. This allows you to fill zones better, but this must compromize your dispersion and thus durability. Suppose you only have 4 racks: three with 2TB drives and one with 8TB drives. Not an unreasonable size for a small cloud. So, you set weights to 25, 25, 25, 100. With replication factor of 3, there's still a good probability (which I'm unable to calculate, although I feel it ought to be easy for someone better educated) that the bigger node will end with 2 replicas for some partitions. Once that node goes down, you lose redundancy completely for those partitions.

In the small-cloud example above, if you care about your customers' data, you have to eat the imbalance and underutilization until your retire the 2TB drives [1].

<clayg> torgomatic: well if you have 6 failure domains in a tier but their sized 10000 10 10 10 10 10 - you're still sorta screwed

My suggestion would be just ignore all the complexity we thoughtfuly provided for the people with "screwed" clusters. Deploy and maintain your cluster to make it easy for the placement and replication: have a good number of more or less uniform zones that are well aligned to natural failure domains. Everything else is a workaround -- even weights.

P.S. Kinda wondering how Ceph deals with these issues. It is more automagic when it decides what to store where, but surely there ought to be a good and bad way to add OSDs.

[1] Strictly speaking, other options exist. You can delegate to another tier by tying 2 small racks into a zone: yet another layer of Swift's complexity. Or, you could put new 8TB drives on trays and stuff them into existing nodes. But considering that muddies the waters.

UPDATE: See the changelog for better placement in Swift 2.2.2.

[link] post comment

Going beyond ZFS by accident [05 Jan 2015|11:25am]

Yesterday, CKS wrote an article that tries to redress the balance a little in the coverage of ZFS. Apparently, detractors of ZFS were pulling quotes from his operational gripes, so he went forceful with the observation that ZFS remains the only viable advanced filesystem (on Linux or not). CKS has no time for your btrfs bullshit.

The situation where weselovskys of the world hold the only viable advanced filesystem hostage and call everyone else "jackass" is very sad for Linux, but it may not be quite so dire, because it's possible that events are overtaking btrfs and ZFS. I am talking about the march of super-advanced, distributed filesystems downstream.

It all started with the move beyond POSIX, which, admittedly, seemed very silly at the time. The early DHT was laughable and I remember how they struggled for years to even enable writes. However, useful software was developed since then.

The poster child of it is Sage's Ceph, which relies on plain old XFS for back-end storage, composes an object storage out of nodes (called RADOS), and layers a POSIX layer on top for those who want it. It is in field use at Dreamhost. I can easily see someone using it where otherwise a ZFS-backed NFS/CIFS cluster would be deployed.

Another piece of software that I place in the same category is OpenStack Swift. I know, Swift is not competing with ZFS directly. The consistency of its meta layer is not sufficient to emulate POSIX anyway. However, you get all those built-in checksums and all that durability jazz that CKS wants. And Swift aims even further up in scale than Ceph, by being in field use at Rackspace. So, what seems to be happening is that folks who really need to go large are willing at times to forsake even compatibility with POSIX, in part of get the benefits that ZFS provides to CKS. Mercado Libre is one well-hyped case of migration from a pile of NFS filers to a Swift cluster.

Now that these systems are established and have themselves proven, I see constant efforts to take them downscale. Original Swift 1.0 did not even work right if you had less than 3 nodes (strictly speaking, if you had fewer zones than replication factor). This was fixed since by so-called "as good as possible placement" around 1.13 Havana, so you can have 1-node Swift easily. Ceph, similarly, would not consider PGs on the same node healthy and it's a bit of a PITA even in Firefly. So yea, there are issues, but we're working on it. And in the process, we're definitely coming for chunks of ZFS space.

[link] 2 comments|post comment

blitz2 for GNOME catastrophy [05 Jan 2015|09:27am]

After putting blitz2 on my Nexus and poking into GUI buttons, I reckoned that it might be time to stop typing in a terminal like a caveman on Linux too (I'm joking, but only just). And the project rolled smoothly for a while. What it took me 2 months to accomplish in Android, only took 2 days in GNOME. However, 2 lines of code from the end, it all came to an abrubt halt when I found out that it is impossible to access clipboard from a JavaScript application. See GNOME bugs 579312, 712752.

[link] post comment

[ viewing | most recent entries ]
[ go | earlier ]