Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QEMU booting in endless looping at post-03 #1301

Open
yu1hpa opened this issue Mar 1, 2024 · 19 comments
Open

QEMU booting in endless looping at post-03 #1301

yu1hpa opened this issue Mar 1, 2024 · 19 comments

Comments

@yu1hpa
Copy link

yu1hpa commented Mar 1, 2024

When I implemented until Newlines section at post-03 and then boot the QUME, cause endless loop like following.

Screen.Recording.2024-03-01.at.4.50.06.mov

I think if I add a volatile type implementation, they won't work.
Could you tell me how to resolve?
Thank you.

Environment

macOS Sonoma
QEMU emulator version 8.2.1
Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers

Following is source code(vga_buffer.rs)

use volatile::Volatile;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Color {
    Black = 0,
    Blue = 1,
    Green = 2,
    Cyan = 3,
    Red = 4,
    Magenta = 5,
    Brown = 6,
    LightGray = 7,
    DarkGray = 8,
    LightBlue = 9,
    LightGreen = 10,
    LightCyan = 11,
    LightRed = 12,
    Pink = 13,
    Yellow = 14,
    White = 15,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(transparent)]
struct ColorCode(u8);

impl ColorCode {
    fn new(foreground: Color, background: Color) -> ColorCode {
        ColorCode((background as u8) << 4 | (foreground as u8))
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
struct ScreenChar {
    ascii_character: u8,
    color_code: ColorCode,
}

const BUFFER_HEIGHT: usize = 25;
const BUFFER_WIDTH: usize = 80;

#[repr(transparent)]
struct Buffer {
    chars: [[Volatile<ScreenChar>; BUFFER_WIDTH]; BUFFER_HEIGHT],
}

pub struct Writer {
    column_position: usize,
    color_code: ColorCode,
    buffer: &'static mut Buffer,
}

impl Writer {
    pub fn write_string(&mut self, s: &str) {
        for byte in s.bytes() {
            match byte {
                0x20..=0x7e | b'\n' => self.write_byte(byte),
                _ => self.write_byte(0xfe),
            }
        }
    }

    pub fn write_byte(&mut self, byte: u8) {
        match byte {
            b'\n' => self.new_line(),
            byte => {
                if self.column_position >= BUFFER_WIDTH {
                    self.new_line();
                }

                let row = BUFFER_HEIGHT - 1;
                let col = self.column_position;

                let color_code = self.color_code;
                self.buffer.chars[row][col].write(ScreenChar {
                    ascii_character: byte,
                    color_code,
                });
                self.column_position += 1;
            }
        }
    }
    fn new_line(&mut self) {}
}

use core::fmt;

impl fmt::Write for Writer {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.write_string(s);
        Ok(())
    }
}

pub fn print_something() {
    use core::fmt::Write;
    let mut writer = Writer {
        column_position: 0,
        color_code: ColorCode::new(Color::Yellow, Color::Black),
        buffer: unsafe { &mut *(0xb8000 as *mut Buffer) },
    };

    writer.write_byte(b'H');
    writer.write_string("ello! ");
    write!(writer, "The numbers are {} and {}", 42, 1.0/3.0).unwrap();
}

Added(2024 April 08)
Cargo.toml

[package]
name = "os"
version = "0.1.0"
edition = "2021"

[dependencies]
bootloader = "0.9"
volatile = "0.2.6"
spin = "0.5.2"


[dependencies.lazy_static]
version = "1.0"
features = ["spin_no_std"]
@phil-opp
Copy link
Owner

phil-opp commented Mar 1, 2024

Could you try checking out and running the post-03 branch of this repo? If that works, there is probably some typo in your code.

@yu1hpa
Copy link
Author

yu1hpa commented Mar 1, 2024

@phil-opp
I've tried to run that branch code, however it also doesn't work for me.

Thank you.

@phil-opp
Copy link
Owner

phil-opp commented Mar 1, 2024

Same endless boot issue?

@yu1hpa
Copy link
Author

yu1hpa commented Mar 1, 2024

Yes, same with the above video.

@phil-opp
Copy link
Owner

phil-opp commented Mar 1, 2024

That's weird. We have a CI job that tests the code regularly and it worked with the same QEMU version on macOS this night.

Could you please try running post-06 too? It has some exception handlers set up so we might see an exception message instead of the endless rebooting.

@MichaelZalla
Copy link

I just ran into something similar. I found I had a newer version of volatile installed (0.5.2). When I downgraded to 0.2.6 and recompiled, the looping in QEMU went away.

@AtomicGamer9523
Copy link

I just ran into something similar. I found I had a newer version of volatile installed (0.5.2). When I downgraded to 0.2.6 and recompiled, the looping in QEMU went away.

That's very interesting, could you please send the changes in your code that occurred when downgrading from 0.5.2 to 0.2.6.
I am looking at the diff between the two versions, and although they are different, they should optimize down to nearly the same thing.

@yu1hpa
Copy link
Author

yu1hpa commented Apr 8, 2024

I just ran into something similar. I found I had a newer version of volatile installed (0.5.2). When I downgraded to 0.2.6 and recompiled, the looping in QEMU went away.

add my Cargo.toml. (also added to the top)
volatile version is 0.2.6.

[package]
name = "os"
version = "0.1.0"
edition = "2021"

[dependencies]
bootloader = "0.9"
volatile = "0.2.6"
spin = "0.5.2"


[dependencies.lazy_static]
version = "1.0"
features = ["spin_no_std"]

And even if i run git clone <blog_os> and checkout to post-03 and build and run it, occur the same issue.
what QEMU version do you use? @MichaelZalla

@phil-opp phil-opp changed the title QUME booting in endless looping at post-03 QEMU booting in endless looping at post-03 Apr 8, 2024
@MichaelZalla
Copy link

@AtomicGamer9523 @yu1hpa

Note that the first commit is a backwards-edit, i.e., I started with finished (working) code from post-03 (in red) and un-did some things to bring the looping bug back (in green).

I'm running QEMU v8.2.91 on Windows 11:

$ qemu-system-x86_64 --version
QEMU emulator version 8.2.91 (v9.0.0-rc1-12050-gb494cb57ce)
Copyright (c) 2003-2024 Fabrice Bellard and the QEMU Project developers

rust_os_boot_qemu_looping

@AtomicGamer9523
Copy link

AtomicGamer9523 commented Apr 17, 2024

Thank you very much, @MichaelZalla. It seems that Rust's code-generator generates different assembly instructions for the two different versions of volatile.

Edit: It seems to be something to do with the introduced NonNull

Edit2: I can reproduce, will be comparing generated assembly

@AtomicGamer9523
Copy link

AtomicGamer9523 commented Apr 19, 2024

Ok, so, here's the situation, I figured out why your implementation @MichaelZalla boot-loops.

This is your main.rs:

#[panic_handler]
fn panic_handler(_info: &PanicInfo) -> ! {
    vga_buffer::print_something();

    loop {}
}

#[no_mangle]
pub extern "C" fn _start() -> ! {
    panic!("Some panic message");
}

The cycle of your program is as such:

|-> boot
|    \/
|   main
|    \/
|   your main function panics
|    \/
|   panic_handler
|    \/
|   vga_buffer::print_something
|    \/
|   death (figuring out why)
|____|

If you moved the vga_buffer::print_something function into your main, nothing breaks. Which is interesting

@UnexDev
Copy link

UnexDev commented May 11, 2024

+1 still happening (Windows).

@UnexDev
Copy link

UnexDev commented May 11, 2024

I have found the issue! It is with this line in print_something:
write!(writer, "The numbers are {} and {}", 42, 1.0 / 3.0).unwrap();

I don't know what is happening, but without that line it works.

@AtomicGamer9523
Copy link

I have found the issue! It is with this line in print_something: write!(writer, "The numbers are {} and {}", 42, 1.0 / 3.0).unwrap();

I don't know what is happening, but without that line it works.

Yes, that line will cause it, however, It seems that there is an issue with actually WRITING to volatile memory in the new version

@michihupf
Copy link

+1

But I am also experiencing the same behaviour when writing with core::ptr::write_volatile instead of relying on the volatile crate.

@michihupf
Copy link

@MichaelZalla @AtomicGamer9523 @UnexDev

While implementing the volatile writes without the use of the volatile crate I am pretty sure I figured out what's going wrong here.

Explanation

repr(transparent)
struct Buffer {
  chars: [[VolatilePtr<'static, ScreenChar>; BUFFER_SIZE_X]; BUFFER_SIZE_Y]
}

You now declared Buffer.chars to hold static pointers to ScreenChar. But we do not initialize this part of memory ourselfs and just cast 0xb8000 to be our VGA buffer:

lazy_static! {
    pub static ref WRITER: Mutex<Writer> = Mutex::new(Writer {
        column_pos: 0,
        color_code: ColorCode::new(Color::White, Color::Black),
        buffer: unsafe { &mut *(0xb8000 as *mut Buffer) },
    });
}

Therefore what lies there is our ScreenChar values. NOT pointers to them. This means ptr::write_volatile will fail as self.pointer.as_ptr() is actually not a valid pointer but ScreenChar(s) that are interpreted as a pointer as we told it to.

In my implementation I had

repr(transparent)
struct Buffer {
  chars: [[&'static mut ScreenChar; BUFFER_SIZE_X]; BUFFER_SIZE_Y]
}

which resulted in the same behavior because the actual values are interpreted as references.

Why does volatile v0.2.6 work?

The Volatile struct in version 0.2.6 of the volatile crate has the same size as its wrapped type:

/// A wrapper type around a volatile variable, which allows for volatile reads and writes
/// to the contained value. The stored type needs to be `Copy`, as volatile reads and writes
/// take and return copies of the value.
///
/// The size of this struct is the same as the size of the contained type.
#[derive(Debug)]
#[repr(transparent)]
pub struct Volatile<T: Copy>(T);

That means for this version interpreting the values in the VGA buffer as Volatile<ScreenChar> as its memory representation looks the same as ScreenChar.

Alternative using core::ptr

I removed the volatile crate and instead implemented the volatile reads and writes on Buffer.

#[repr(transparent)]
struct Buffer {
    chars: [[ScreenChar; BUFFER_SIZE_X]; BUFFER_SIZE_Y],
}

impl Buffer {
    /// Writes character `c` to `row`,`col` in the VGA buffer.
    fn write(&mut self, row: usize, col: usize, c: ScreenChar) {
        unsafe {
            // UNSAFE: all pointers in `chars` point to a valid ScreenChar in the VGA buffer.
            ptr::write_volatile(&mut self.chars[row][col], c);
        }
    }

    /// Reads a character from the VGA buffer at position `row`,`col`.
    fn read(&self, row: usize, col: usize) -> ScreenChar {
        unsafe {
            // UNSAFE: all pointers in `chars` point to a valid ScreenChar in the VGA buffer.
            ptr::read_volatile(&self.chars[row][col])
        }
    }
}

TLDR

Buffer.chars needs to hold representations of ScreenChar (that look like a plain ScreenChar), NOT pointers or references to them.
This is not an issue with the newer version of the volatile crate. To make this work with newer versions you would need to create a Volatile instance on every write or waste space by creating a second buffer with Volatile objects. Alternatively implement volatile reads/writes using core::ptr::write_volatile and core::ptr::read_volatile.

@AtomicGamer9523
Copy link

TLDR

Buffer.chars needs to hold representations of ScreenChar (that look like a plain ScreenChar), NOT pointers or references to them.
This is not an issue with the newer version of the volatile crate. To make this work with newer versions you would need to create a Volatile instance on every write or waste space by creating a second buffer with Volatile objects. Alternatively implement volatile reads/writes using core::ptr::write_volatile and core::ptr::read_volatile.

Oh my God, that makes so much sense now. Thank you for finding this, I was already starting to loose hope.

@michihupf
Copy link

Unfortunately I am not sure if this also resolves the original problem as from what I understand the boot loop occurs even with volatile v0.2.6. From what I am seeing the above code should work.

@AtomicGamer9523
Copy link

Finally have some free time on my hands. Started working on a fix / explanation here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants