Release Engineering

This project is intended for all release-related issues (ISOs, installer, etc.), under the umbrella of the Arch Linux Release Engineers.

FS#17231 - Can't boot LiveCDs 2009.08 (/dev/archiso doesn't show up)

Attached to Project: Release Engineering
Opened by Heiko Baums (cyberpatrol) - Saturday, 21 November 2009, 14:55 GMT
Last edited by Dave Reisner (falconindy) - Saturday, 30 July 2011, 20:39 GMT
Task Type: Bug Report
Category: Hardware Issues
Status: Closed
Assigned To: No-one
Architecture: All
Severity: High
Priority: Normal
Reported Version: 2009.08
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 1
Private: No

Details

I can't boot the LiveCDs 2009.08, neither netinstall nor core.

When it tries to access /dev/archiso it shows these messages:

:: Waiting for boot device ...
Waiting 30 seconds for device /dev/archiso ...
ERROR: boot device didn't show up after 30 seconds ...
Falling back to interactive prompt
You can try to fix the problem manually, logout when you are finished
ramfs$

I can't find another error message in the boot log but it loads the initrd image /boot/archiso_pata.img.

I've got a Pioneer DVR-216D and I'm using a pure SATA system in AHCI mode without any IDE/PATA devices.
I didn't have any problems with any previous LiveCDs.

Closed by  Dave Reisner (falconindy)
Saturday, 30 July 2011, 20:39 GMT
Reason for closing:  None
Additional comments about closing:  +1 Year of inactivity. Nothing to do with archiso. Maybe this broken hardware is solved with newer Linux and Udev.
Comment by Gerardo Exequiel Pozzi (djgera) - Saturday, 21 November 2009, 23:01 GMT
See  FS#16374  - 2009.8 CDr does not boot properly with a SATA DVDr

What is your optical disc: CD-R/CD-RW/DVD-R/DVD+R/DVD-RW/etc.?
Which program did you use to burn the medium: cdrkit/cdrtools/cdrdao/growisofs/other?
What were the write settings: DAO (SAO), TAO, or other?
Comment by Heiko Baums (cyberpatrol) - Sunday, 22 November 2009, 01:33 GMT
Optical disc: CD-R
Program: cdrskin 0.7.3 (part of libburn) and K3b 1.68.0alpha3 with wodim 1.1.9
Settings: SAO

But this shouldn't affect the boot behaviour, because every other CD-R and DVD-R works, except for the latest Knoppix, and that is due to missing support for pure SATA systems in Knoppix.
Comment by Gerardo Exequiel Pozzi (djgera) - Monday, 23 November 2009, 05:10 GMT
OK, I asked this in case there were problems with the media, like the I/O errors in  FS#16374 .

At the ramfs$ prompt:
Can you see any messages about the DVD-R drive in the kernel dmesg?
Which modules are loaded? Is the ahci module loaded?
Comment by Heiko Baums (cyberpatrol) - Monday, 23 November 2009, 14:19 GMT
lsmod at ramfs$ prompt showed me these modules:
sr_mod
cdrom
sd_mod
ata_generic
ohci1394
ieee1394
ahci
pata_atiixp
pata_acpi
ohci_hcd
ehci_hcd
libata
scsi_mod
usbcore

On my installed system I have some additional modules loaded which could affect this issue:
ide_pci_generic
atiixp
ide_core
sg
usb_storage

I particularly suspect sg.

I couldn't find a relevant error message in dmesg as far as I could scroll back in dmesg.

There are these messages in dmesg at ramfs$ prompt:
ata3: SATA link down (SStatus 0 SControl 300)
ata3: softreset failed (device not ready)
ata3: failed due to HW bug, retry pmp=0
and 1 or 2 lines later
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
and then some lines with information about the drive.

The same for both drives (the harddisk and the DVD writer).

On my installed system I have the same messages but one additional line between "ata3: failed ..." and "ata3: SATA link up...":
ata3: applying SB600 PMP SRST workaround and retrying

And on my installed system there are some other additional lines in dmesg:
sd 0:0:0:0: Attached scsi generic sg0 type 0
sr 1:0:0:0: Attached scsi generic sg1 type 5
sd 6:0:0:0: Attached scsi generic sg2 type 0
Comment by Gerardo Exequiel Pozzi (djgera) - Tuesday, 24 November 2009, 02:33 GMT
sg (SCSI generic) is not important here; it is generally used (attached to sr) to burn better with cdrtools (not cdrkit).

The correct module is loaded (sr_mod). You should see something like this:
...
ataN.nn: ATAPI: "DVD SIGNATURE"
...
scsi 1:0:0:0 CD-ROM ....
...
sr0: scsi3-mmc drive: .....
...
sr: 1:0:0:0 Attached scsi CD-ROM sr0
...

Is there any possibility of capturing the dmesg output and attaching it here? Could you save it to a USB drive or a floppy?

PS: I guess you meant to say ata_piix, not atiixp
Comment by Heiko Baums (cyberpatrol) - Tuesday, 24 November 2009, 04:45 GMT
> PS: I guess you meant to say ata_piix, not atiixp

No, I mean atiixp. I have both modules atiixp and pata_atiixp loaded on my system.
I don't have ata_piix.

I've got an AMD Athlon 64 X2 6000 on a Gigabyte GA-MA770-DS3. North Bridge is AMD 770 and South Bridge is AMD SB700. Just in case.

I'll try to capture dmesg later.
Comment by Heiko Baums (cyberpatrol) - Tuesday, 24 November 2009, 14:11 GMT
Here are dmesg and lsmod of my installed system, LiveCD 2009.02 and LiveCD 2009.08.

And I still suspect sg. I seem to remember that I once compiled a kernel on Gentoo without sg (SCSI generic) because the documentation said it is not necessary, and then I couldn't access either my DVD writer (it was an IDE drive) or my USB stick or something else. After compiling sg into the kernel it worked. So sg is documented as not necessary if other SCSI modules are compiled into the kernel or built as modules, but in fact sg is important. And sg is automatically loaded on my installed system and on the LiveCD 2009.02. So I guess there is a reason for this.

But there are far fewer modules loaded in LiveCD 2009.08 than on my installed system and on LiveCD 2009.02. Maybe there's another module missing.
Comment by Gerardo Exequiel Pozzi (djgera) - Tuesday, 24 November 2009, 14:37 GMT
In both dmesg outputs I can see the sr0 device, which is the important one.
Is this device created in /dev? If it is available, is it accessible (can you mount it, for example)? If so, try to make a manual symlink named /dev/archiso -> /dev/sr0 and exit from the shell.
Another test is accessing the drive via sg (modprobe sg) and /dev/sgN; check which one is your DVD unit.

There are fewer modules because the 2009.02 listing was taken beyond the initrd stage. If you boot with break=y you will see roughly the same number of loaded modules.

PS: Ignore the error (both kernels display the same message): "ata2: softreset failed (device not ready), ata2: failed due to HW bug, retry pmp=0"
Comment by Heiko Baums (cyberpatrol) - Tuesday, 24 November 2009, 16:10 GMT
Problem found. The link /dev/archiso wasn't created.

After
ramfs$ cd /dev
ramfs$ ln -s sr0 archiso
ramfs$ exit
it continued booting correctly.

And sg was also loaded after this. ;-)

Here's the new dmesg and lsmod, just in case.
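
The manual workaround above can be wrapped in a small helper. This is only a sketch: the function name link_boot_device and its arguments are hypothetical, and the directory is a parameter so that the helper can be tried in a scratch directory instead of /dev.

```shell
# Hypothetical helper: create <dir>/<link> -> <node>, but only if the node exists.
# Mirrors the manual steps "cd /dev; ln -s sr0 archiso" from the ramfs$ prompt.
link_boot_device() {
    dev_dir=$1   # directory holding the device nodes (normally /dev)
    node=$2      # existing device node, e.g. sr0
    link=$3      # name of the symlink to create, e.g. archiso

    # Refuse to create a dangling link.
    [ -e "$dev_dir/$node" ] || return 1
    # Relative target, like the manual "ln -s sr0 archiso".
    ln -s "$node" "$dev_dir/$link"
}
```

At the prompt this would be called as: link_boot_device /dev sr0 archiso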
Comment by Gerardo Exequiel Pozzi (djgera) - Tuesday, 24 November 2009, 23:36 GMT
Excellent! The mystery now is why the udev rule that creates the symlink does not work.
It would be very useful if you could track it down.

At the ramfs$ prompt:
(1) The rule is created at /lib/udev/rules.d/00-archiso-device.rules by the hook "archiso-early". Can you see ARCHISO_AHCOHH6O in the last line of this file?
(2) If you execute udevadm trigger, is the symlink created?

Thanks.
Comment by Heiko Baums (cyberpatrol) - Wednesday, 25 November 2009, 06:32 GMT
(1) /lib/udev/rules.d/00-archiso-device.rules is attached. The file /lib/initcpio/hooks/archiso-early doesn't exist.
(2) udevadm trigger creates the symlink /dev/archiso.
Comment by Gerardo Exequiel Pozzi (djgera) - Wednesday, 25 November 2009, 14:46 GMT
(1) OK. No, inside the initrd the hooks are under /hooks/ ;) And 00-archiso-device.rules only exists if archiso-early has been executed.
(2) Good! But for some reason udev, which is executed just after the archiso-early hook, does not find or does not execute this rule.
Comment by Heiko Baums (cyberpatrol) - Thursday, 26 November 2009, 11:25 GMT
(1) archiso-early is indeed there. I've attached it.
(2) The problem is, that there's no ls command at the ramfs$ prompt so I can't check e.g. the file permissions, look for files etc. Finding and copying the other files was just knowing or guessing the path and executing a cat <filename>.
Comment by Heiko Baums (cyberpatrol) - Thursday, 26 November 2009, 12:01 GMT
Can it be that vol_id is missing on the LiveCD?
Comment by Heiko Baums (cyberpatrol) - Thursday, 26 November 2009, 15:36 GMT
On the other hand, if vol_id were missing on the LiveCD (I couldn't find it at the ramfs$ prompt), it wouldn't have been possible to create /dev/archiso with udevadm trigger.
Comment by Gerardo Exequiel Pozzi (djgera) - Thursday, 26 November 2009, 23:33 GMT
There is no ls, yes, but you can use echo * ;) vol_id is under /lib/udev/

ramfs$ /lib/udev/vol_id --export /dev/sr0
ID_FS_USAGE=filesystem
ID_FS_TYPE=iso9660
ID_FS_VERSION=
ID_FS_UUID=
ID_FS_UUID_ENC=
ID_FS_LABEL=ARCHISO_AHCOHH6O
ID_FS_LABEL_ENC=ARCHISO_AHCOHH6O

But your problem is weird: on the first call to udevadm trigger, executed by the [udev] hook just after [archiso-early], it appears that udev cannot find the rule generated by [archiso-early]. But when you call it from the command line, this rule is executed OK.

It looks like (a weird, hypothetical idea) a race condition, as if this rule is not fully written to the ramdisk at the time udev is executed.

The only problem is that there is no way to pass arguments to udev in order to debug this. The only way is to rebuild the ISO with modified hooks (and maybe it works fine as-is with a new kernel).
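
The generated rule apparently keys on the filesystem label (the ARCHISO_AHCOHH6O value asked about earlier). As an illustration only, the label can be pulled out of KEY=value output in the style of vol_id --export as sketched below. The function name fs_label_from_export is hypothetical, the sample input is copied from the output above, and sed is assumed to be available where this is run.

```shell
# Hypothetical helper: read KEY=value lines on stdin and print the value of ID_FS_LABEL.
# The pattern ends in "=", so ID_FS_LABEL_ENC is not matched.
fs_label_from_export() {
    sed -n 's/^ID_FS_LABEL=//p'
}

# Sample input copied from the vol_id output above.
sample='ID_FS_USAGE=filesystem
ID_FS_TYPE=iso9660
ID_FS_LABEL=ARCHISO_AHCOHH6O
ID_FS_LABEL_ENC=ARCHISO_AHCOHH6O'

printf '%s\n' "$sample" | fs_label_from_export
```

On a live system the input would come from the tool itself, for example: /lib/udev/vol_id --export /dev/sr0 | fs_label_from_export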
Comment by Heiko Baums (cyberpatrol) - Saturday, 28 November 2009, 04:48 GMT
In the meantime I rebooted the LiveCD many times, extracted and searched the initrd archiso_pata.img and I found out some things.

Maybe I should have written this in the initial report but I found out that this could be important only after doing the testing.
Before these messages:
:: Waiting for boot device ...
Waiting 30 seconds for device /dev/archiso ...
ERROR: boot device didn't show up after 30 seconds ...
Falling back to interactive prompt
You can try to fix the problem manually, logout when you are finished
ramfs$

There are *only* two other messages:
:: Running Hook [archiso]
:: Mounting tmpfs, size=75%...done
:: Waiting for boot device...

In the script init there's this paragraph:

if [ -e "/hooks" ]; then
    for h in ${HOOKS}; do
        TST=""
        eval "TST=\$hook_${h}"
        if [ "${TST}" != "disabled" ]; then
            run_hook () { msg "${h}: no run function defined"; }
            if [ -e "/hooks/${h}" ]; then
                . /hooks/${h}
                msg ":: Running Hook [${h}]"
                run_hook
            fi
        fi
    done
fi

So before the message
:: Running Hook [archiso]
there should have been the messages
:: Running Hook [base]
:: Running Hook [archiso-early]
:: Running Hook [udev]
which are missing.

Nevertheless these hooks seem to be run anyway.

But I'm wondering what
eval "TST=\$hook_${h}"
and
if [ "${TST}" != "disabled" ]; then
are doing.

Is there a possible bug?

Another idea: The problem could probably be fixed by simply renaming 00-archiso-device.rules to 99-archiso-device.rules. Udev probably doesn't know /dev/sr0 yet when it triggers 00-archiso-device.rules, but only after it has triggered the other rules. This could explain why it creates /dev/archiso when udevadm trigger is run manually at the ramfs$ prompt, because this time it knows /dev/sr0 from the first udevadm trigger in the udev hook.

Just another question: When and where is $tempnode set (to /dev/sr0)?

Another guess is that it's probably just a timing issue, so that inserting a sleep for 1 or 2 seconds between udev -d and udevadm trigger in the udev hook could fix this.

Otherwise with /sbin/udevd --daemon --debug in the udev hook instead of /sbin/udevd --daemon we could get a debug output of udev.

But my favourite thought is the renaming of 00-archiso-device.rules to 99-archiso-device.rules.

Btw., in the meantime I've tested the LiveCD 2009.08 on two other computers: a desktop with an AMD Athlon 64 X2, a SATA hard disk and two IDE DVD writers, and a notebook of which I don't know the hardware details yet. On the desktop it booted without any problems. On the notebook it also didn't boot, but there it got stuck when starting udev at the first runlevel after the initrd. I got this message:
"Waiting for UDev uevents to be processed [BUSY]"

I haven't had time to look deeper into this. I think I'll borrow the notebook for a few days next week.

The latter is probably a different issue but maybe it's a bug in the udev version on the LiveCD and just updating the packages on the LiveCD could fix it.

The problem is that I don't know how to rebuild the iso or build an Arch based iso generally.
Comment by Heiko Baums (cyberpatrol) - Saturday, 28 November 2009, 05:28 GMT
Problem found!
It's indeed the file name 00-archiso-device.rules.

I copied the file 00-archiso-device.rules to /lib/udev/rules.d of my installed system on my harddisk and rebooted my PC. I inserted the LiveCD when the grub menu of my harddisk was loaded because of the ID_FS_LABEL_ENC. And there was no link /dev/archiso.

Then I renamed /lib/udev/rules.d/00-archiso-device.rules on my harddisk to 99-archiso-device.rules. And guess what? After rebooting the link /dev/archiso was there and pointed to /dev/sr0.
Comment by Gerardo Exequiel Pozzi (djgera) - Saturday, 28 November 2009, 16:44 GMT
Interesting that no hooks appear to be executed. This is how the boot looks under normal behaviour:
:: Loading Initramfs
:: Running Hook [archiso-early]
:: Running Hook [udev]
:: Loading udev...
@@@ messages of detected devices @@@
done.
:: Running Hook [archiso]
:: Mounting tmpfs, size=75%...done.
:: Waiting for boot device...
Waiting 30 seconds for device /dev/archiso ...
SUCCESS: Mounted archiso volume successfully.
@@@ rest of messages about mounting images @@@
:: Passing control to Arch Linux Initscripts...Please Wait
INIT: version 2.86 booting

About the loop: it checks whether the hook has been disabled via the kernel command line (see the exports earlier in the script).
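
The indirect lookup in that loop can be tried in isolation. The sketch below reuses the eval line from the init script quoted earlier. The function name enabled_hooks is hypothetical, and it is an assumption that a disabled hook is marked by a variable named hook_<name> holding the value "disabled".

```shell
# Sketch of the indirect lookup from the init loop.
# Assumption: a disabled hook is marked by a variable named hook_<name>=disabled.
enabled_hooks() {
    for h in $HOOKS; do
        TST=""
        # Expands to e.g. TST=$hook_udev, i.e. reads the variable named hook_<h>.
        eval "TST=\$hook_${h}"
        if [ "${TST}" != "disabled" ]; then
            printf '%s\n' "$h"
        fi
    done
}

HOOKS="base udev archiso"
hook_udev=disabled      # what disabling the udev hook is assumed to set
enabled_hooks           # prints "base" and "archiso", one per line
```

A hook whose variable is unset yields an empty TST, so it counts as enabled and is run.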

> Just another question: When and where is $tempnode set (to /dev/sr0)?
In the real scenario it points to another (temporary) path, just before /dev/sr0 is created. This is managed by the udevd daemon.

Confused! Regarding your questions: you are saying that you see no other messages, so udev is not executed. In that case renaming the rule from 00 to 99 would have no effect, and neither would timing, right? Again, are you sure that the [udev] hook is not executed? According to the second message I guess that it is executed.

About udev being [BUSY] forever: only God knows!
The renaming from 00 to 99 is interesting, but without debugging I can only offer suppositions :(

For recreating the iso, if you need help please let me know.
Comment by Gerardo Exequiel Pozzi (djgera) - Saturday, 28 November 2009, 17:27 GMT
pacman -S --needed git cdrkit squashfs-tools devtools pwgen
cd /tmp # or any directory with at least 1G of free space
git clone git://projects.archlinux.org/archiso.git
cd archiso/archiso
sed -i 's|00|99|' hooks/archiso-early # first try building without this change
make install
cd ../configs/install-iso
make net-iso # this will take about 3 minutes or more, depending on which packages need to be downloaded because they are not already in the pacman cache
Comment by Heiko Baums (cyberpatrol) - Saturday, 28 November 2009, 17:30 GMT
I booted the LiveCD again and watched the output more closely.

All these messages
:: Loading Initramfs
:: Running Hook [archiso-early]
:: Running Hook [udev]
:: Loading udev...
are echoed. The output just scrolled too fast and there were too many other outputs so that I couldn't scroll back enough in the screen buffer (Shift+PageUp). So I just missed these messages.

Btw., if these hooks weren't executed then /lib/udev/rules.d/00-archiso-device.rules couldn't have been created.

I would suggest we forget the udev [BUSY] on the notebook for now until this device link bug is fixed. Maybe it's just an aftereffect. I'll file a new bug report when I have more information about the hardware of the notebook.

Regarding the renaming of 00 -> 99 there's no more testing necessary. You can simply rename it in the archiso-early hook and rebuild and upload the new iso.

You can `grep sr /lib/udev/rules.d/*`. You will see that the sr* devices are created by the rules in 60-persistent-storage.rules. At least they are not created when 00-archiso-device.rules is triggered. `udevadm trigger` triggers the rules in alphabetical order. So it first triggers 00-archiso-device.rules and only later 60-persistent-storage.rules and the other rules. This means that there are no sr* devices when it triggers 00-archiso-device.rules, so of course it can't create the archiso link. The sr* devices are only created later, when it triggers 60-persistent-storage.rules.

When you now run udevadm trigger manually from the ramfs$ prompt, the sr* devices are of course there, because they were created by the first `udevadm trigger` within the udev hook. That's why the second, manual `udevadm trigger` can create the archiso link. In my case I think it should also be sufficient to rename 00-archiso-device.rules to 61-archiso-device.rules, but I think it's better to rename it to 99-... to be certain that this rule is the last rule that is triggered and that every other device is created before.
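
The ordering premise behind the renaming idea can be checked with a plain sort. The sketch below only demonstrates how the three file names mentioned here sort; it says nothing about whether this ordering is the actual cause of the missing symlink. The function name rules_in_order is hypothetical.

```shell
# List rule file names in byte-wise (C locale) sort order.
rules_in_order() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

# The 00- file sorts first, the 60- file second, the 99- file last.
rules_in_order 99-archiso-device.rules 60-persistent-storage.rules 00-archiso-device.rules
```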

I already tested it on my locally installed system and it worked without a doubt. Even better, as soon as I removed the LiveCD the /dev/archiso link was removed automatically, too, and it was recreated automatically as soon as I inserted the LiveCD again.

So you can indeed just rename the file and rebuild the iso without a debug. It's no supposition, it's knowledge, logic and testing. ;-) Believe me.
Comment by Heiko Baums (cyberpatrol) - Saturday, 28 November 2009, 17:36 GMT
I'll rebuild the iso and test it but I'm absolutely sure that I'm right. ;-)
Comment by Gerardo Exequiel Pozzi (djgera) - Saturday, 28 November 2009, 17:42 GMT
You can create symlinks to nonexistent files ;) And again, remember that $tempnode does not contain /dev/sr0; that was only an example.
Comment by Heiko Baums (cyberpatrol) - Saturday, 28 November 2009, 19:30 GMT
I can, but I don't think that udev can and/or does.
Creating the isos didn't work. When booting these CDs I get a kernel panic: no init found.
Comment by Gerardo Exequiel Pozzi (djgera) - Saturday, 28 November 2009, 23:14 GMT
Mmm, OK. To build the ISOs it is better to work in a clean chroot with the {base} group and only the needed packages listed above (or you can attach the output log of the make here) ;)
Comment by Heiko Baums (cyberpatrol) - Saturday, 28 November 2009, 23:33 GMT
OK, then please tell me how I get a clean chroot with the base group. Or, let's say, how do I get the base group installed in a directory other than / ? Pacman usually installs in / . Is this effort really necessary? I can tell you the result anyway.
Comment by Heiko Baums (cyberpatrol) - Saturday, 28 November 2009, 23:36 GMT
Btw., I have the impression that creating the iso has written some things into my / directory structure where it doesn't belong. But I can't tell what and where.
Comment by Heiko Baums (cyberpatrol) - Saturday, 28 November 2009, 23:44 GMT
Tried the following:
mkdir /mnt/buildarchiso
pacman -b /var/lib/pacman -r /mnt/buildarchiso -S base

This wants to reinstall the whole base group and tells me that dcron conflicts with fcron and wants to uninstall fcron. So it seems that pacman wants to install the packages from base group to / and not to /mnt/buildarchiso.
Comment by Gerardo Exequiel Pozzi (djgera) - Sunday, 29 November 2009, 00:00 GMT
No, to make the chroot use:
# mkarchroot /tmp/coco base base-devel git cdrkit squashfs-tools devtools pwgen
Then enter it with:
# mkarchroot -r /bin/bash /tmp/coco
Edit the mirrorlist for pacman,
then follow the steps for making the ISO.

This will take 1.4G of space.
Comment by Heiko Baums (cyberpatrol) - Sunday, 29 November 2009, 03:11 GMT
Good news: Creating the iso worked. But you have forgotten make in the mkarchroot command. ;-)
Bad news: On my harddisk renaming from 00 to 99 helped, on the LiveCD not. And udevd --daemon --debug doesn't give any helpful output and udevadm --verbose --debug trigger doesn't do anything. But when I then execute udevadm trigger manually /dev/archiso is still not created, only after the second manually executed udevadm trigger.
Comment by Heiko Baums (cyberpatrol) - Sunday, 29 November 2009, 03:12 GMT
And thanks for the explanations how to create the iso!
Comment by Sven-Hendrik Haase (Svenstaro) - Saturday, 05 December 2009, 19:15 GMT
New info on creating iso is always updated by me in this article: http://wiki.archlinux.org/index.php/Archiso
djgera posted old info as I changed the required bootloader in archiso just a day after that, assuming you use git.
Comment by Gerardo Exequiel Pozzi (djgera) - Thursday, 04 March 2010, 04:38 GMT
Please test the latest iso builds at http://build.archlinux.org/isos/
Comment by Heiko Baums (cyberpatrol) - Thursday, 04 March 2010, 22:23 GMT
The bug is still there in archlinux-2010.03.02-netinstall-i686.iso.
It's exactly the same. /dev/archiso is not created and booting ends in a ramfs$ prompt. Only running udevadm trigger at the ramfs$ prompt creates /dev/archiso.
Comment by Gerardo Exequiel Pozzi (djgera) - Thursday, 04 March 2010, 22:54 GMT
Weird behavior. So, in summary: you need a second udevadm trigger...

When the [udev] hook is executed, do you see any messages about your DVD-ROM drive?

At the ramfs$ prompt, do you see /dev/sr0? If so, the new archiso has a kernel command line parameter archisodevice=, so you can set it to /dev/sr0. Please try this.
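
For illustration, a key=value parameter such as archisodevice= can be picked out of a kernel command line string as sketched below. How archiso itself reads the parameter is not shown in this report; the function name cmdline_value and the sample command line are hypothetical.

```shell
# Hypothetical helper: print the value of <key>=<value> found in a
# space-separated kernel command line string.
cmdline_value() {
    key=$1
    cmdline=$2
    for arg in $cmdline; do
        case $arg in
            # Strip everything up to and including the first "=".
            "$key"=*) printf '%s\n' "${arg#*=}" ;;
        esac
    done
}

# Sample command line (an assumption, not taken from a real boot).
cmdline_value archisodevice "break=y archisodevice=/dev/sr0 quiet"
```

On a running system the second argument would be the contents of /proc/cmdline, for example: cmdline_value archisodevice "$(cat /proc/cmdline)"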
Comment by Heiko Baums (cyberpatrol) - Friday, 05 March 2010, 01:27 GMT
By now I don't believe that I need a second udevadm trigger, because I noticed that the [archiso] is executed before the [udev] hook is completely executed, because there are messages which belong to the [udev] hook displayed after starting the [archiso] hook. So probably the [archiso] hook just needs to wait for the [udev] hook to be completed and be started later.

I see some messages about my DVD-ROM drive when the [udev] hook is executed. The drive is detected correctly. Tell me, if you need the output, then I'll post it here.

I also see /dev/sr0 at the ramfs$ prompt.

Starting with the kernel parameter archisodevice=/dev/sr0 doesn't work at all.
With this parameter I get these messages during the execution of the [archiso] hook:

:: Running Hook [archiso]
:: Mounting tmpfs, size=75% ... done.
:: Waiting 30 seconds for device /dev/sr0 ...
ERROR: /dev/sr0 found, but the filesystem type is unknown.
Falling back to interactive prompt
You can try to fix the problem manually, log out when you are finished
/bin/sh: can't access tty; job control turned off
[ramfs /]#

Between these messages there are again some messages from the [udev] hook.

If I now enter udevadm trigger and exit at this prompt, I get a kernel panic:
Kernel panic - not syncing
Attempt to kill init!
Comment by Gerardo Exequiel Pozzi (djgera) - Friday, 05 March 2010, 02:10 GMT
>>> I noticed that the [archiso] is executed before the [udev] hook is completely executed, because there are messages which belong to the [udev] hook displayed after starting the [archiso] hook. So probably the [archiso] hook just needs to wait for the [udev] hook to be completed and be started later.

mmm what messages? The [udev] hook finishes, and then [archiso] is executed. At least "udevadm settle" waits until all uevents are finished (this is what the documentation says).

>>> ERROR: /dev/sr0 found, but the filesystem type is unknown.

just to be sure: you obtain the same error for /dev/archiso?

Please boot with break=y option and see the output of blkid /dev/sr0, should be /dev/sr0: LABEL="ARCH_201003" TYPE="udf", and/or use the symlink.
Comment by Heiko Baums (cyberpatrol) - Friday, 05 March 2010, 03:27 GMT
> just to be sure: you obtain the same error for /dev/archiso?

No, only for /dev/sr0, and only if I boot with the kernel parameter archisodevice=/dev/sr0.

> mmm what messages?

Messages about setting up SCSI/SATA devices like these:
scsi 8:0:0:0: Direct-Access Generic USB SD Reader 1.00 PQ: 0 ANSI: 0
sd 8:0:0:0: Attached scsi generic sg2 type 0
etc.

They definitely belong to the [udev] hook, because similar messages are shown, when the hard disk and the DVD writer are set up, before the [archiso] hook is executed. And these messages are still shown after the [archiso] hook is executed and are mixed up with the [archiso] messages. And they are shown after the [ramfs /]# prompt is shown.

> The [udev] hook finishes, and then [archiso] is executed. At least "udevadm settle" waits until all uevents are finished (this is what the documentation says).

Then this is either a bug in udev or in the udev documentation, most likely the first.

> Please boot with break=y option and see the output of blkid /dev/sr0, should be /dev/sr0: LABEL="ARCH_201003" TYPE="udf", and/or use the symlink.

This doesn't help. It only stops when the [archiso] hook is executed and gives these messages:

:: Running Hook [archiso]
:: Break requested, type 'exit' for resume operation
/bin/sh: can't access tty; job control turned off
[ramfs /#] scsi 8:0:0:0 Direct-Access Generic USB SD Reader 1.00 PQ: 0 ANSI: 0
sd 8:0:0:0: Attached scsi generic sg2 type 0
<etc.>
<And after pressing Enter:>
[ramfs /]#

If I now enter 'exit' then I get the messages about the missing /dev/archiso and the /bin/sh error message.
Comment by Gerardo Exequiel Pozzi (djgera) - Friday, 05 March 2010, 03:48 GMT
Maybe waiting for uevent queue empty does not imply that the device is ready... (I begin to doubt, I never had the opportunity to experience this kind of problems to speak of personal experience and I'm just guessing.)

>> This doesn't help. It only stops when the [archiso] hook is executed and gives these messages:
Yes, that is the idea: stopping just before the mount hook is executed, so that you can view the state of the system.

Please execute blkid /dev/sr0 at the prompt; I want to see the output. If nothing is output, wait a few seconds.
Comment by Heiko Baums (cyberpatrol) - Friday, 05 March 2010, 11:19 GMT
The output of "blkid /dev/sr0" is "/dev/sr0: LABEL="ARCH_201003" TYPE="udf"".
Comment by Sven-Hendrik Haase (Svenstaro) - Friday, 05 March 2010, 11:22 GMT
Both x86_64 images built from current git master (not the pre-baked isos you provide for download though) work fine for me in both VBox and QEMU-KVM.
Comment by Heiko Baums (cyberpatrol) - Friday, 05 March 2010, 11:41 GMT
Where can I download the images?
Comment by Sven-Hendrik Haase (Svenstaro) - Friday, 05 March 2010, 11:58 GMT
Gerardo posted a link in the comments here earlier: http://build.archlinux.org/isos/
Comment by Heiko Baums (cyberpatrol) - Friday, 05 March 2010, 12:10 GMT
Sounded like you had built different isos and have a different download link.
Comment by Sven-Hendrik Haase (Svenstaro) - Friday, 05 March 2010, 12:11 GMT
No, I built my own isos from the current git master. I cannot provide download links due to my slow connection.
Comment by Heiko Baums (cyberpatrol) - Friday, 05 March 2010, 12:15 GMT
That's obvious. ;-)
Comment by Sven-Hendrik Haase (Svenstaro) - Friday, 05 March 2010, 19:10 GMT
I'd just like to add that I also tested on a real USB key and booting from that works awesome as well. I even got a persistent partition on it.
Comment by Heiko Baums (cyberpatrol) - Friday, 05 March 2010, 22:04 GMT
Dieter, for which or whose response are you waiting?

Sven, do you have a SATA-only system, meaning one with only SATA devices?
Comment by Sven-Hendrik Haase (Svenstaro) - Friday, 05 March 2010, 23:56 GMT
I just tested a virtual mixed, IDE-only and SCSI-only system. All working. The real machine I booted on only has SCSI (SATA) devices.
Comment by Heiko Baums (cyberpatrol) - Saturday, 06 March 2010, 00:39 GMT
I have no idea. I think we should assume that there's an upstream bug in udev, either in udevd or in udevadm (settle).
I looked at the initscripts and hooks of the CD earlier and found no bugs and, apart from the archiso-related hooks, no essential differences from my installed system, which works perfectly. But I couldn't see anything wrong with these hooks.

What I can think of is an issue with udevd or udevadm settle, at least if the system is booted from a DVD writer or from my DVD writer, a Pioneer DVR-216D.

Also, the fact that some of the udevd messages are shown after the [archiso] hook is started, although udevadm settle, which should wait until every device is set up by udevd, has been executed, points to an upstream udev bug. Unless you have another idea what can be done to fix this bug, I think we should file a bug report with udev upstream. Tell me if you want to file this bug report, since your udev knowledge is most likely better than mine, or if I should do it.

At least udev upstream probably has an idea what the reason can be if it's not a udev bug.
Comment by Heiko Baums (cyberpatrol) - Saturday, 06 March 2010, 00:50 GMT
Comment deleted, because I've accidentally sent the incomplete comment which is now edited.
Comment by Gerardo Exequiel Pozzi (djgera) - Saturday, 06 March 2010, 01:14 GMT
As I said before: "Maybe waiting for the uevent queue to be empty (udevadm settle) does not imply that the device is ready...". OK?

In other words, the device node sr0 is created, but the hardware device is not ready. Since the archiso loop only checks that /dev/archiso is ready and does not check that the hardware device is ready, execution continues and then blkid fails.

I guess I can add, without any issues, another loop around blkid for your case. I can do it if you are interested and willing to build your own ISO. OK?
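
The idea of an extra wait loop around blkid can be sketched as a generic retry helper. This is not the patch attached to this report, whose contents are not shown here; the function name wait_until and the one-second polling interval are assumptions made for illustration.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the timeout (in seconds) is used up.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Run the probe quietly; success ends the wait.
        "$@" >/dev/null 2>&1 && return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

With such a helper the hook could first wait for the node and then for a readable filesystem, for example: wait_until 30 test -e /dev/archiso, followed by wait_until 30 blkid /dev/archiso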
Comment by Gerardo Exequiel Pozzi (djgera) - Saturday, 06 March 2010, 02:08 GMT
OK, here is the patch, please test it :)

:: Waiting for boot device...
Waiting 30 seconds for device /dev/archiso ...
Waiting 30 seconds for device filesystem /dev/archiso ...
UDF-fs: Partition marked readonly; forcing readonly mount

pacman -S git squashfs-tools syslinux devtools cdrkit make mkinitcpio --needed
git clone git://github.com/djgera/archiso -b djgera
cd archiso
git am < /path/to/0007-Add-another-device-test-if-become-ready.patch
cd archiso
make install
cd ../configs/syslinux-iso/
make net-iso
Comment by Heiko Baums (cyberpatrol) - Saturday, 06 March 2010, 19:37 GMT
This patch doesn't help. Still the same, only that it now says "ERROR: boot device didn't show up after 60 seconds..." instead of "ERROR: boot device didn't show up after 30 seconds...".
Comment by Heiko Baums (cyberpatrol) - Saturday, 06 March 2010, 19:46 GMT
> In other words, the device node sr0 is created, but the hardware device is not ready. Since the archiso loop only checks that /dev/archiso is ready and does not check that the hardware device is ready, execution continues and then blkid fails.

It shouldn't matter, btw., whether the device /dev/sr0 is ready. It should be sufficient that the device exists, because for creating the symlink only the device node is needed, not the hardware, and the device node /dev/sr0 exists. And why should the DVD writer not be ready? I mean the CD is already in the drive and was already read, because booting was already started from the CD. So the drive must already be initialized.

The device not being ready should bring up a different error message like the one I get if I boot with the parameter archisodevice=/dev/sr0. But the link /dev/archiso should be created nevertheless.
Comment by Gerardo Exequiel Pozzi (djgera) - Saturday, 06 March 2010, 20:37 GMT
Grrrr! Absolutely confused!

>>> And why should the DVD writer not be ready? I mean the CD is already in the drive and was already read, because booting was already started from the CD. So the drive must already be initialized.

Because different methods are used: when booting, the CD-ROM is in "legacy" mode and no driver is needed. Once the driver is loaded, the CD-ROM is set up again (some time is needed for it to become ready again), and only then can it be accessed.

Previously you could access it from the prompt with blkid and get information about the filesystem... Now there is another wait, not only for the node to be ready but also for the filesystem on the device to be ready. Confused!
What do you get if you exit from that shell? What remains is to test command after command by hand...

boot with: break=y disablehooks=udev,archiso

at prompt:
/sbin/udevadm trigger ; /sbin/udevadm settle ; blkid /dev/sr0
Comment by Heiko Baums (cyberpatrol) - Sunday, 07 March 2010, 17:42 GMT
Well, this seems to become weirder. These seem to be two issues: the original one and one with udevadm trigger or settle.

If I boot with break=y disablehooks=archiso_early,udev,archiso and enter /sbin/udevadm trigger ; /sbin/udevadm settle ; blkid /dev/sr0 at the prompt, I get the usual udev output, and within this output the [ramfs /]# prompt, but no blkid output:

...
sr 1:0:0:0: Attached scsi generic sg1 type 5
sr0: scsi3-mmc drive: 12x/12x writer cd/rw xa/form 2 cdda tray
Uniform CD-ROM driver Revision: 3.20
[ramfs /]# scsi 6:0:0:0: Direct-Access Generic USB SD Reader 1.00 PQ: 0
ANSI: 0
sd 6:0:0:0 Attached scsi generic sg2 type 0
scsi 6:0:0:1: Direct-Access Generic USB CF Reader 1.01 PQ: 0 ANSI: 0
sd 6:0:0:1: Attached scsi generic sg3 type 0
...
sd 6:0:0:0: [sdb] Attached SCSI removable
sd 6:0:0:1: [sdc] Attached SCSI removable
sd 6:0:0:2: [sdd] Attached SCSI removable
sd 6:0:0:3: [sde] Attached SCSI removable

In the next line there's only the cursor. If I press Enter I get the [ramfs /]# prompt back. Well the prompt was already there, but between the whole udev output.

Now I found out that the udev output which was shown after the ramfs prompt was in all cases related to my internal USB multi card reader (SD, CF, etc.). So I unplugged it and tried it again.

If I now boot with break=y disablehooks=archiso_early,udev,archiso and enter /sbin/udevadm trigger ; /sbin/udevadm settle ; blkid /dev/sr0 at the prompt, I don't get the whole udev output anymore, but the blkid output and the ramfs prompt:

/dev/sr0: LABEL="ARCH_201003" TYPE="udf"
[ramfs /]#

But if I boot the CD normally with the USB card reader disconnected I still get the same /dev/archiso bug.

So the original bug still exists, and additionally there's a bug in udevadm trigger or settle, which somehow doesn't seem to be able to handle the internal USB multi card reader, which is also accessed by the SCSI system and which is simply connected to a USB connector on the mainboard.
Comment by Heiko Baums (cyberpatrol) - Sunday, 07 March 2010, 18:02 GMT
I tried it again with break=y disablehooks=udev,archiso, but the result was the same except for one or two differences. If I disconnect the USB card reader, this time I again get the whole udev output, of course except for the messages related to the card reader. And the blkid output was missing in both cases, with the card reader connected and disconnected.
