Pete Zaitcev ( zaitcev) wrote,
Pete Zaitcev
zaitcev

gcc overoptimizations

So, here are two cases which happened recently.

First case, Jan Glauber posts this:

--- linux-2.6.9/fs/read_write.c.ppos
+++ linux-2.6.9/fs/read_write.c
@@ -188,6 +188,7 @@ ssize_t do_sync_read(struct file *filp, 
 
 	init_sync_kiocb(&kiocb, filp);
 	kiocb.ki_pos = *ppos;
+	barrier();
 	ret = filp->f_op->aio_read(&kiocb, buf, len, kiocb.ki_pos);
 	if (-EIOCBQUEUED == ret)
 		ret = wait_on_sync_kiocb(&kiocb);

The aio.c in 2.6.19 is substantially different, so perhaps the problem is not triggered there. But the lesson is still needed to be learned.

What happens here is, gcc uses *ppos for the aio_read argument. So, thread A enters do_sync_read, initializes kiocb, including initializing ki_pos from *ppos. Then, thread B changes *ppos. When thread A calls ->aio_read, it uses the new value, inconsistent with the one in kiocb. Something down the road calculates wrong array index and oopses.

I'm sure C purists will argue that gcc is not doing anything illegal here. And I have to warn that blog comments mentioning "volatile" will be deleted with extreme prejustice.

Second case is an unaligned access in proc connector on ia64:

        __u8 buffer[CN_PROC_MSG_SIZE];

        msg = (struct cn_msg*)buffer;
        ev = (struct proc_event*)msg->data;

        ktime_get_ts(&ts); /* get high res monotonic timestamp */
        ev->timestamp_ns = timespec_to_ns(&ts);
        ev->what = PROC_EVENT_EXEC;

The cn_msg is set so that it makes its payload aligned to 4 bytes, whereas the timestamp_ns is a 8 byte value. Fair enough, it's a broken API design. But then, this does not work either:

        __u8 buffer[CN_PROC_MSG_SIZE];
        u64 timestamp_ns;

        msg = (struct cn_msg*)buffer;
        ev = (struct proc_event*)msg->data;

        ktime_get_ts(&ts); /* get high res monotonic timestamp */
        timestamp_ns = timespec_to_ns(&ts);
        memcpy(&ev->timestamp_ns, ×tamp_ns, sizeof(ev->timestamp_ns));
        ev->what = PROC_EVENT_EXEC;

Apparently, gcc optimizes memcpy into "st8 [r39]=r40". Either Erik mistakenly ran an unmodified kernel, or gcc is completely out to lunch.

May we have our gcc-2.96 back please?

Tags: advogato, linux
  • Post a new comment

    Error

    Anonymous comments are disabled in this journal

    default userpic

    Your reply will be screened

    Your IP address will be recorded 

  • 6 comments