Problems installing 5.1 on Udoo-X86's eMMC

Richard Betel

Feb 12, 2018
So I'm trying to install Proxmox on an Udoo X86.

First attempt: The installer boots and asks for a drive to install to, and I choose "mmcblk0". Proxmox then tries to partition it, and freaks out because it can't find the partitions it just created. Dropping to a shell, I see /dev/mmcblk0p1 and /dev/mmcblk0p2, so I don't know why it can't find them. I took a look at the installer, and as far as I can tell it looks for the partitions in /sys/block, and I don't see them there.
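
For anyone hitting the same thing, here is a minimal sketch of the naming rule that probably matters. On Linux, a disk whose name ends in a digit (mmcblk0, nvme0n1) gets a "p" before the partition number, and in sysfs the partitions normally sit inside the disk's own directory rather than directly under /sys/block. The function names partition_dev and list_sysfs_partitions are made up for illustration; this is not taken from the Proxmox installer, and it is only a guess that this is what trips it up.

```shell
#!/bin/sh
# Hypothetical helper: build the partition device name for a disk.
# $1 = disk device (e.g. /dev/mmcblk0 or /dev/sda), $2 = partition number
partition_dev() {
    disk="$1"
    num="$2"
    case "$disk" in
        *[0-9]) echo "${disk}p${num}" ;;   # name ends in a digit: insert "p"
        *)      echo "${disk}${num}" ;;    # otherwise just append the number
    esac
}

# Hypothetical helper: list the partitions sysfs exposes for a disk, if any.
# $1 = bare disk name (e.g. mmcblk0)
list_sysfs_partitions() {
    disk="$1"
    for p in /sys/block/"$disk"/"$disk"*; do
        [ -e "$p" ] && basename "$p"
    done
    return 0
}

partition_dev /dev/mmcblk0 1
partition_dev /dev/sda 1
```

Running list_sysfs_partitions mmcblk0 from the installer's shell would show whether the kernel has actually registered the new partitions.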

Second attempt: I slapped an SD card into the Udoo, and tried to install again. The SD card is recognized as mmcblk1, and the exact same errors ensue.

Third attempt: I booted Debian 9 and did an install to mmcblk0. This ran fine. Awesome! Now, with a working prompt and an ssh connection, I set up apt to install Proxmox. It downloaded the packages, and then failed to configure most of them (the Proxmox kernel was properly installed and boots fine). I get a series of dependency failures, ultimately because pve-firewall failed to configure. Poking around, nothing in /etc/pve got created, and if I manually run pve-firewall in /usr/sbin, it ignores absolutely all command line options, tries to open some connections to *something*, which fails, then exits. I've run dpkg --configure -a a few times, and it makes little to no difference.

Fourth attempt: I found copies of example config files for Proxmox in /usr/local/doc/pve, and put them in /etc/pve/pve-firewall and /etc/pve/local. That helped not at all.


Any suggestions on what to try next? I have another install of pve on a different machine, and it runs totally fine. Could I copy /etc/pve over from there and change hostnames and IP addresses in the config files to get it working?
 
Thought I should probably post the error messages I'm seeing. First, dpkg:
Code:
root@cloud1:/etc/pve# !dpkg
dpkg --configure -a
Setting up pve-firewall (3.0-5) ...
Job for pve-firewall.service failed because the control process exited with error code.
See "systemctl status pve-firewall.service" and "journalctl -xe" for details.
dpkg: error processing package pve-firewall (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of qemu-server:
 qemu-server depends on pve-firewall; however:
  Package pve-firewall is not configured yet.

dpkg: error processing package qemu-server (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of proxmox-ve:
 proxmox-ve depends on qemu-server; however:
  Package qemu-server is not configured yet.

dpkg: error processing package proxmox-ve (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of pve-manager:
 pve-manager depends on pve-firewall; however:
  Package pve-firewall is not configured yet.
 pve-manager depends on qemu-server (>= 1.1-1); however:
  Package qemu-server is not configured yet.

dpkg: error processing package pve-manager (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of pve-ha-manager:
 pve-ha-manager depends on qemu-server; however:
  Package qemu-server is not configured yet.

dpkg: error processing package pve-ha-manager (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of pve-container:
 pve-container depends on pve-ha-manager; however:
  Package pve-ha-manager is not configured yet.

dpkg: error processing package pve-container (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 pve-firewall
 qemu-server
 proxmox-ve
 pve-manager
 pve-ha-manager
 pve-container


Then, trying to run pve-firewall:
Code:
root@cloud1:/etc/pve# /usr/sbin/pve-firewall help
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
 
I found a message in the proxmox mailing list that talked about a similar issue, and it said that it looks like corosync is offline. So I did this:
Code:
root@cloud1:/var/lib/corosync# systemctl status corosync
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2018-02-12 19:48:12 EST; 4min 22s ago
Condition: start condition failed at Mon 2018-02-12 19:52:30 EST; 4s ago
           └─ ConditionPathExists=/etc/corosync/corosync.conf was not met
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
  Process: 1759 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=exited, status=8)
 Main PID: 1759 (code=exited, status=8)
      CPU: 78ms

Feb 12 19:48:12 cloud1 systemd[1]: Starting Corosync Cluster Engine...
Feb 12 19:48:12 cloud1 corosync[1759]:  [MAIN  ] parser error: Missing closing brace
Feb 12 19:48:12 cloud1 corosync[1759]: error   [MAIN  ] parser error: Missing closing brace
Feb 12 19:48:12 cloud1 corosync[1759]: error   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1238.
Feb 12 19:48:12 cloud1 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a
Feb 12 19:48:12 cloud1 systemd[1]: Failed to start Corosync Cluster Engine.
Feb 12 19:48:12 cloud1 systemd[1]: corosync.service: Unit entered failed state.
Feb 12 19:48:12 cloud1 systemd[1]: corosync.service: Failed with result 'exit-code'.

I thought maybe I was getting somewhere, but the working proxmox machine doesn't have that file either. I tried creating an empty /etc/corosync/corosync.conf, but that didn't work:
Code:
root@cloud1:/var/lib/corosync# touch /etc/corosync/corosync.conf
root@cloud1:/var/lib/corosync# systemctl restart corosync
Job for corosync.service failed because the control process exited with error code.
See "systemctl status corosync.service" and "journalctl -xe" for details.
root@cloud1:/var/lib/corosync# systemctl status corosync
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2018-02-12 19:53:13 EST; 7s ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
  Process: 1793 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=exited, status=8)
 Main PID: 1793 (code=exited, status=8)
      CPU: 94ms

Feb 12 19:53:13 cloud1 systemd[1]: Starting Corosync Cluster Engine...
Feb 12 19:53:13 cloud1 corosync[1793]:  [MAIN  ] Corosync Cluster Engine ('2.4.2-dirty'): started and ready to provide service.
Feb 12 19:53:13 cloud1 corosync[1793]: notice  [MAIN  ] Corosync Cluster Engine ('2.4.2-dirty'): started and ready to provide service.
Feb 12 19:53:13 cloud1 corosync[1793]: info    [MAIN  ] Corosync built-in features: dbus rdma monitoring watchdog augeas systemd upstart xmlconf qdevices qnetd snmp pie relro bindnow
Feb 12 19:53:13 cloud1 corosync[1793]: error   [MAIN  ] Could not open /etc/corosync/authkey: No such file or directory
Feb 12 19:53:13 cloud1 corosync[1793]: error   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1302.
Feb 12 19:53:13 cloud1 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a
Feb 12 19:53:13 cloud1 systemd[1]: Failed to start Corosync Cluster Engine.
Feb 12 19:53:13 cloud1 systemd[1]: corosync.service: Unit entered failed state.
Feb 12 19:53:13 cloud1 systemd[1]: corosync.service: Failed with result 'exit-code'.

I think technically, this is negative progress: I've learned that more processes are not properly installed.
 
OK! Some progress! pve-cluster couldn't figure out my IP address because there was a 127.0.1.1 entry for the hostname in /etc/hosts. I commented that out, and pve-cluster started, and then dpkg --configure -a worked for everything else!
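
A rough sketch of how to check for this before installing, assuming the problem is simply that the hostname maps to a 127.x address in /etc/hosts. The function name hosts_maps_to_loopback is made up for illustration, and it only looks at the hosts file, not at DNS.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the hosts file maps the given name
# to a 127.x.x.x address on a line that is not commented out.
# $1 = hostname to look for, $2 = hosts file (defaults to /etc/hosts)
hosts_maps_to_loopback() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }                # skip commented-out lines
        $1 ~ /^127\./ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break       # ignore a trailing comment
                if ($i == name) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/hosts}"
}

if hosts_maps_to_loopback "$(uname -n)"; then
    echo "WARNING: $(uname -n) maps to a loopback address in /etc/hosts"
fi
```

If the warning prints, pointing the hostname at the machine's real IP address (or commenting out the 127.0.1.1 line, as above) should let pve-cluster start.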

But it's still broken. pveproxy is running and accepting connections, but not responding. It looks like some things are still missing from /etc/pve, e.g. /etc/pve/nodes does not exist. It's on a FUSE mount, so I don't know how to fix it... yet.
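
A quick way to confirm that /etc/pve really is a FUSE mount rather than a plain directory is to look it up in the mount table. This is only a sketch: is_fuse_mount is a made-up name, and it assumes the usual /proc/mounts layout where the second field is the mount point and the third is the filesystem type.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the path is mounted with a fuse* filesystem.
# $1 = mount point, $2 = mount table (defaults to /proc/mounts)
is_fuse_mount() {
    awk -v mp="$1" '
        $2 == mp && $3 ~ /^fuse/ { found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/mounts}"
}

if is_fuse_mount /etc/pve; then
    echo "/etc/pve is a FUSE mount"
else
    echo "/etc/pve is not a FUSE mount"
fi
```

If it is not mounted, anything written to /etc/pve lands on the root filesystem and gets hidden once the FUSE mount comes back.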
 
I tried apt-get remove proxmox-ve && apt autoremove,
then I mv'ed /var/lib/pve-cluster/config.db to config.db.bak,
then apt-get install proxmox-ve...

It re-installed with no errors, but config.db wasn't properly populated. /etc/pve/nodes, for example, still didn't exist, and mkdir wouldn't work.

I read that pmxcfs has a -l option, so I tried that. Suddenly, I could make directories under /etc/pve. So... I need to find the setup scripts to create .conf files, generate keys, etc.
 
OK, looks like I'm done. Here's the summary:

1) The Proxmox installer didn't properly handle the mmcblk0 partition names.
2) I screwed up during my manual install of Debian, botching /etc/hosts. I should have deleted the 127.0.1.1 entry from /etc/hosts before installing Proxmox.
3) config.db was not properly populated.
4) I couldn't modify the config because there was no cluster quorum, even though I wasn't running a cluster. Steps to fix:
5) systemctl stop pve-cluster
6) pmxcfs -l
7) mkdir /etc/pve/nodes /etc/pve/nodes/<hostname> /etc/pve/nodes/<hostname>/priv /etc/pve/nodes/<hostname>/lxc /etc/pve/nodes/<hostname>/openvz /etc/pve/nodes/<hostname>/qemu-server
8) pvecm updatecerts
9) reboot
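
The fix steps can be sketched as a small script. Treat this as an untested outline, not a supported procedure: make_node_dirs and repair_pve_node are made-up names, the directory list is just the one from the mkdir step, and using hostname -s for the node name is an assumption. The wrapper is defined but deliberately not called.

```shell
#!/bin/sh
# Hypothetical helper: create the per-node directory skeleton.
# $1 = config root (normally /etc/pve), $2 = node name
make_node_dirs() {
    root="$1"
    node="$2"
    for sub in priv lxc openvz qemu-server; do
        mkdir -p "$root/nodes/$node/$sub" || return 1
    done
}

# Hypothetical wrapper for the whole sequence; run as root on the broken node.
repair_pve_node() {
    systemctl stop pve-cluster || return 1
    pmxcfs -l || return 1                    # start pmxcfs in local mode
    make_node_dirs /etc/pve "$(hostname -s)" || return 1   # assumes short hostname = node name
    pvecm updatecerts || return 1
    echo "Done; reboot so pve-cluster comes back up normally."
}
```

To use it, source the file as root and call repair_pve_node, then reboot.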

'Course, I haven't done anything with the machine yet, so there might still be gotchas, but the web page at :8006 is responding, at least.