I suspect people trying to find alternate CPU architectures that don't suffer from #Spectre - like bugs have misunderstood how fundamental the problem is.Your CPU will not go fast without caches. Your CPU will not go fast without speculative execution. Solving the problem will require more silicon, not less. I don't think the market will accept the performance hit implied by simpler architectures. OS, compiler and VM (including the browser) workarounds are the way this will get mitigated.
CPUs will not go fast without caches and speculative execution, you say? Prof. Babayan may have something to say about that. Back when I worked under him in the 1990s, he considered caches a primitive workaround.
The work on Narch was informed by the observation that the submicron feature size provided designers with more silicon they knew what to do with. So, the task of a CPU designer was to identify ways to use massive amounts of gates productively. But instead, mediocre designers simply added more cache, even multi-level cache.
Talking about it was not enough, so he set out to design and implement his CPU, called "Narch" (later commercialized as "Elbrus-2000"). And he did. The performance was generally on par with its contemporaries, such as Pentium III and UltraSparc. It had a cache, but measured in kilobytes, not megabytes. But there were problems beyond the cache.
The second part of the Bee Yarn Knee's objection deals with the speculative execution. Knocking that out required a software known as a binary translator, which did basically the same thing, only in software[*]. Frankly at this point I cannot guarantee that it weren't possible to abuse that mechanism for unintentional signaling in the same ways Meltdown works. You don't have cache for timing signals in Narch, but you do have the translator, which can be timed if it runs at run time like in Transmeta Crusoe. In Narch's case it only ran ahead of time, so not exploitable, but the result turned out to be not fast enough for workloads that make a good use of speculative execution today (such as LISP and gcc).
Still, I think that a blanket objection that CPU cannot run fast with no cache and no speculative execution, IMHO, is informed by ignorance of alternatives. I cannot guarantee that E2k would solve the problem for good, after all its later models sit on top of a cache. But at least we have a hint.
[*] The translator grew from a language toolchain and could be used in creative ways to translate source. It would not be binary in such case. I omit a lot of detail here.
UPDATE: Oh, boy:
But the speedup from speculative execution IS from parallelism. We're just asking the CPU to find it instead of the compiler. So couldn't you move the smarts into the compiler?
Sean, this is literally what they said 30 years ago.