I just swapped in a new SSD (a brand new one), and I will monitor it closely to gather more information.
To answer your question: what do you mean by "how many VMs I use"? Do you mean cluster-wide or per node?
On this node I had about 6 VMs, but cluster-wide I currently have about 100 running VMs.
I would add that there is definitely a problem here ... I lost 2 more drives recently ... out of a batch of 15 disks, I have now lost 8 ... that's far too many failures to be a coincidence ...
As I said, I was running the exact same setup with ProxMox 3 (SSD as OS disk, SAS drives as Ceph OSDs, nothing stored...
Here is a screen capture of the SMART values taken on one of my nodes.
This node's SSD failed 2 weeks ago and I replaced it with a brand new disk only 7 days ago.
As you can see, the SSD_Life_Left value is at 99, which is good (you must read it in reverse compared to the Wearout value you find on Samsung's...
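For reference, these values come straight from smartctl, so you can check the same attribute table on any node with something like this (the device path is only an example, adjust it to your SSD):

# print the SMART attribute table of the SSD
smartctl -A /dev/sda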
Hi,
Sorry for the provocative punchline, but there is something really strange going on here ...
I'm running ProxMox 4.3, but since my migration to ProxMox 4 (I was happily running a ProxMox 3 cluster before ...), I have seen very weird (and dangerous) behaviour with ProxMox installed on SSDs.
So...
* create a new VM with ID 104 on your new server with a qcow disk
* overwrite the disk in /var/lib/vz/images/104/ with your original file
* in the Web UI, go to the VM settings; the disk size is probably wrong, so unplug the disk (select the disk, then click the "remove" button)
* the disk will...
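For those who prefer the command line, here is a rough sketch of the copy-and-rescan part of the steps above (the source path and qcow2 file name are only examples, check what was actually created under /var/lib/vz/images/104/):

# overwrite the freshly created disk image with the original file
cp /path/to/original.qcow2 /var/lib/vz/images/104/vm-104-disk-1.qcow2

# make Proxmox re-read the real disk size for VM 104
qm rescan --vmid 104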
Why still qemu 2.6 and not 2.7?
2.7 has been available for nearly a month now and would (probably) fix many problems related to iothread-enabled volumes, such as backups and hot unplug.
The VM doesn't need to be offline ... you can move a disk of a running VM from one storage to another with no downtime ... that's black magic to me, but it works like a charm!
Beware of the virtio driver version ... I had trouble with the "latest" release and Windows 2012R2; I had to downgrade to the "stable" release (virtio-scsi 62.72.104.10200, released on 2015-03-10).
For now, my Windows 2012R2 under ProxMox 4.2 is very stable.
To be precise, this is a problem I encountered only...
Personally, I would rather go with Ceph instead of GlusterFS (even GlusterFS backed by ZFS) ... it features everything you need (live migration, snapshotting, thin provisioning, high availability, ...) and with ProxMox it is rather easy to set up and well integrated.
I would also say that Ceph gives far better...
That's what it looks like, but it is not ... the SMART state is good, and since I restarted the node it has been running smoothly with no errors.
As I said, I don't think this is a hardware issue but rather a software one, although I don't really know where.
As I said again, I was using the same...
Well ... I would think this is somewhat related to the kernel/Ceph (maybe KRBD), because most of the time the origin of my crashes is Ceph-related.
OSDs or MONs go down for no reason, at any moment, on any node.
My Ceph storage volume is configured with KRBD and not the qemu driver. I read...
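For context, a KRBD-enabled RBD entry in /etc/pve/storage.cfg looks roughly like this (the storage ID, pool and monitor addresses below are placeholders, the relevant part is the krbd flag):

rbd: ceph-vm
        monhost 10.0.0.1 10.0.0.2 10.0.0.3
        pool rbd
        username admin
        content images
        krbd 1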
Hi,
I was using ProxMox 3.4 with Ceph on a 9-node cluster quite nicely for more than a year, and it was very stable.
I reinstalled this cluster (and even added nodes) with ProxMox 4.2, updated to the current release as of today, and not a week goes by without a kernel panic, down OSDs, or Ceph...
I had the same kind of problem when installing ProxMox on zroot.
Unless you really need ZFS as your root volume, I would suggest installing ProxMox on a minimal ext4 partition (let's say 10 GB, which is more than enough even when dealing with updates) and leaving empty space that you can later allocate to ZFS...
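As a rough sketch of that suggestion (device, pool and storage names are only examples), the leftover space can later be turned into a pool and registered as a Proxmox storage:

# create a ZFS pool on the spare partition left after the ext4 install
zpool create -f tank /dev/sda4

# register it in Proxmox as a ZFS storage for VM disks
pvesm add zfspool tank --pool tank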
Hi,
It is not possible to download a container template to a GlusterFS volume even though this volume is marked as a storage for container templates.
Uploading an ISO file, on the other hand, works fine.
I get the error you can see in the screen capture.
It is still possible to use this...
Hi,
Another nasty bug with storages: if I declare a GlusterFS storage on my cluster, this storage is automatically mounted in /mnt/pve/<volume name> on every node, which is great, but ... when I disable or delete this volume, it stays mounted and I must unmount it by hand on every node...
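The manual cleanup mentioned above boils down to running this on each node (same placeholder as above):

# after disabling or deleting the storage, remove the leftover mount by hand
umount /mnt/pve/<volume name>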
I have already encountered that kind of issue, and it was related to a malfunctioning storage.
For example, I have a cluster with Ceph as the main storage, distributed across all my nodes except for one node which was not able to connect to the Ceph cluster (the Ceph network is 10G fiber-based and this server...