MCORD Installation Issues


#1

Hi All,

Reference link: https://guide.opencord.org/profiles/mcord/install.html

Server details:
RAM - 96 GB
CPUs - 2 X Octacore Intel® Xeon® CPU E5-2690 0 @ 2.90GHz (32 vCPUs)
OS - Ubuntu 16.04

  1. On one server in our lab, I tried to configure MCORD using the convenience script “mcord-in-a-box.sh”.

  2. The script output is displayed on the console, and after the execution completed, the available console output suggested there were no issues.

Note: When Jimmy executed the same script on one of his servers, he did see some issues in the console output.

  3. After the convenience script execution, I performed a few validation checks and observed the issues below.

3.1. The cordvtn-nodes command fails. See the output below.

biarca@ubuntu:~/cord/helm-charts$ ssh -p 8101 onos@onos-cord-ssh.default.svc.cluster.local cordvtn-nodes
Password authentication
Password:
Command not found: cordvtn-nodes
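
In ONOS, a “Command not found” response usually means the application that contributes that CLI command is not active. A quick way to check (a sketch; I am assuming the VTN app is identified as org.opencord.vtn, which is not confirmed by the output above):

    # List the active ONOS applications and look for the VTN app;
    # if org.opencord.vtn is absent, the cordvtn-* CLI commands will not exist.
    ssh -p 8101 onos@onos-cord-ssh.default.svc.cluster.local "apps -s -a" | grep -i vtn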

3.2. I see only one “Cirros 0.3.5 64-bit” image when I execute “openstack image list”, whereas there should be more.

3.3. A Bad Request error occurs when trying to create the vEPC service instance for MCORD. Below is the output.

http -a admin@opencord.org:letmein POST http://xos-gui.default.svc.cluster.local:4000/xosapi/v1/vepc/vepcserviceinstances blueprint=mcord_5 site_id=1

HTTP/1.1 400 Bad Request
Connection: keep-alive
Content-Type: text/html
Date: Wed, 01 Aug 2018 13:13:16 GMT
Server: nginx/1.13.12
Transfer-Encoding: chunked

{"fields": {}, "specific_error": "Cannot find eligible owner of class VEPCService", "error": "XOSValidationError"}

Googling the above error does not turn up much help. Also, I do not see any logs from the “xos-gui” and “mcord-epc-service” pods when I execute the above command.
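
The “Cannot find eligible owner” message suggests that no VEPCService object exists for the new service instance to attach to. A quick check of that, assuming the vEPC app exposes a vepcservices endpoint alongside the vepcserviceinstances endpoint used above (I have not verified the exact URL):

    # List VEPCService objects; an empty list would explain the
    # "Cannot find eligible owner of class VEPCService" error.
    http -a admin@opencord.org:letmein GET http://xos-gui.default.svc.cluster.local:4000/xosapi/v1/vepc/vepcservices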

3.4. Only the management network is displayed when I execute “openstack network list”.

3.5. Nothing is displayed when I execute “openstack server list --all-projects”.

3.6. I observed issues while accessing the UI and raised them with the cord-dev community. Andy provided a workaround, and after following it I could access the UI.

3.7. I will continue to work on the issues and keep you all updated on the progress.

  • Manoj

#2

Today I performed the MCORD installation again on a fresh setup, and I too observed the “containers failed to start.” message. Upon debugging, I found that the convenience script starts a job named mcord-tosca-loader, and that job failed. The job creates a pod with the naming convention “mcord-tosca-loader-”. I started a new mcord-tosca-loader job and it was successful. Below is the output.

biarca@ubuntu:~$ kubectl get jobs
NAME                          DESIRED   SUCCESSFUL   AGE
base-openstack-tosca-loader   1         1            4h
mcord-mcord-subscriber        1         1            4h
mcord-tosca-loader            1         0            4h
mcord-tosca-loader-test       1         1            1h     ----- I created it
mcord-tosca-loader-test2      1         1            16m    ----- I created it
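
For reference, a rough way to retry a failed job like this (a sketch; the exact fields to strip may vary with the Kubernetes version) is to export the original job spec, rename it, drop the auto-generated fields, and re-create it:

    # Export the failed job's spec and re-submit it under a new name.
    kubectl get job mcord-tosca-loader -o yaml > mcord-tosca-loader-test.yaml
    # Edit the file: change metadata.name, and remove auto-generated fields
    # (metadata.uid, spec.selector, the controller-uid labels, and the status section).
    kubectl create -f mcord-tosca-loader-test.yaml
    kubectl get jobs        # the new job should eventually show SUCCESSFUL = 1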

Even after this, I continue to see the validation errors, which I mentioned earlier.

Best Regards,
Manoj


#3

I raised the issue with the cord-dev community on Friday (3rd Aug 2018); the link is below.

https://groups.google.com/a/opencord.org/forum/#!topic/cord-dev/oJEvOYTyJho

  • Manoj

#4

On my single-node M-CORD cluster, I currently have the following issues.

  1. When I execute the command below, I see the message “Command not found: cordvtn-nodes”.
    ssh -p 8101 onos@onos-cord-ssh.default.svc.cluster.local cordvtn-nodes

  2. In OpenStack, I see only 2 networks, management and sgi_network, whereas the following should be displayed.

    s11_network
    management
    s6a_network
    spgw_network
    flat_network_s1u
    db_network
    sgi_network
    flat_network_s1mme

  3. No servers (instances or VMs) were created in OpenStack, whereas the following 5 should have been created.

    mysite_vmme-2
    mysite_vspgwu-1
    mysite_hssdb-5
    mysite_vhss-4
    mysite_vspgwc-3

The issue seems to be with the openstack-helm configuration. We will dig further into that angle and update here (a first-pass check is sketched after this list).

  4. In the UI, I now see a lot of information, including ServiceInstance notifications (both successes and failures).

  5. I had a call with Jimmy yesterday and showed him the current state of the setup.
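
Regarding the openstack-helm angle, a first-pass check (a sketch, assuming openstack-helm deploys into its default “openstack” namespace) is to confirm that all OpenStack pods are healthy and that the compute and network services have registered:

    # Any pod that is not Running/Completed in the openstack namespace is suspect.
    kubectl get pods -n openstack | grep -vE 'Running|Completed'
    # Confirm that nova-compute and the neutron agents have checked in.
    openstack compute service list
    openstack network agent list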

Thanks & Regards,
Manoj


#5

The issue seems to be with openstack-helm. To root-cause it, we tried a plain openstack-helm installation (following the official OpenStack site as well) on a VM and also on bare metal, and we see the issues below, the same ones we see on the M-CORD server.

  1. The host on which QEMU was configured is listed, but without a Host IP (it shows as None). Below is the output.

    biarca@ubuntu:~/cord$ openstack hypervisor list
    +----+---------------------+-----------------+---------+-------+
    | ID | Hypervisor Hostname | Hypervisor Type | Host IP | State |
    +----+---------------------+-----------------+---------+-------+
    | 1  | ubuntu              | QEMU            | None    | up    |
    +----+---------------------+-----------------+---------+-------+

  2. When we try to create an instance, it fails with the error “No valid host was found. There are not enough hosts available.” (A few checks worth running are sketched right after this list.)
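
Both symptoms may point at nova-compute not being fully registered. A few checks worth running (a sketch; the pod name is a placeholder and “openstack” is openstack-helm’s default namespace):

    # Is nova-compute up and enabled, and what does nova know about the host?
    openstack compute service list --service nova-compute
    openstack hypervisor show ubuntu
    # Look for registration/scheduling errors in the nova-compute logs.
    kubectl get pods -n openstack | grep nova-compute
    kubectl logs -n openstack <nova-compute-pod-name> | grep -i error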

After a good amount of googling, I posted the issue in CORD’s Slack and in the #openstack-helm Slack. One person responded in the CORD Slack saying he is also seeing the same issue and is looking for help. No responses in the #openstack-helm Slack yet.

Thanks & Regards,
Manoj


#6

We are also pursuing OpenStack-Ansible, and I am sure we will have better success since it is a “known beast” :smile: Once OpenStack-Ansible is operational, we will look into the integration aspect.


#7

All,

Today, on one of the local servers in my lab, the M-CORD convenience script execution was successful!! For the first time in the last couple of weeks, I was able to access the M-CORD UI without issues.

But I am not completely done yet. After the above-mentioned script execution, there are a few validation checks that need to be done. I did not see any issues with the M-CORD checks. With the OpenStack checks, although I did not see errors, the number of items listed is fewer than expected. For example, I see the networks below in OpenStack…

openstack network list
+--------------------------------------+------------+--------------------------------------+
| ID                                   | Name       | Subnets                              |
+--------------------------------------+------------+--------------------------------------+
| 77051599-afed-4d00-b5ed-9a2d6f94b121 | management | 5053ebbc-3050-490d-8b55-3b96cd9e68d9 |
+--------------------------------------+------------+--------------------------------------+

Whereas it looks like the list should contain the following…
+--------------------------------------+--------------------+--------------------------------------+
| ID                                   | Name               | Subnets                              |
+--------------------------------------+--------------------+--------------------------------------+
| 0bc8cb20-b8c7-474c-a14d-22cc4c49cde7 | s11_network        | da782aac-137a-45ae-86ee-09a06c9f3e56 |
| 5491d2fe-dcab-4276-bc1a-9ab3c9ae5275 | management         | 4037798c-fd95-4c7b-baf2-320237b83cce |
| 65f16a5c-f1aa-45d9-a73f-9d25fe366ec6 | s6a_network        | f5804cba-7956-40d8-a015-da566604d0db |
| 6ce9c7e9-19b4-45fd-8e23-8c55ad84a7d7 | spgw_network       | 699829e1-4e67-46a7-af2d-c1fc72ba988e |
| 87ffaaa3-e2a9-4546-80fa-487a256781a4 | flat_network_s1u   | 288d6a8c-8737-4e0e-9472-c869ba3e7c92 |
| 8ec59660-4751-48de-b4a3-871f4ff34d81 | db_network         | 6f14b420-0952-4292-a9f2-cfc8b2d6938e |
| d63d3490-b527-4a99-ad43-d69412b315b9 | sgi_network        | b445d554-1a47-4f3b-a46d-1e15a01731c0 |
| dac99c3e-3374-4b02-93a8-994d025993eb | flat_network_s1mme | 32dd201c-8f7f-4e11-8c42-4f05734f716a |
+--------------------------------------+--------------------+--------------------------------------+

Maybe because of this issue, I see synchronization failures for some instances in the M-CORD UI. I will continue to work on this.
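
To figure out whether this is a loading problem or a synchronization problem, one comparison worth making (a sketch; the /xosapi/v1/core/networks endpoint is an assumption based on the pattern of the vEPC endpoints used earlier) is what XOS thinks should exist versus what Neutron actually has:

    # Networks as XOS knows them (the endpoint pattern is an assumption)...
    http -a admin@opencord.org:letmein GET http://xos-gui.default.svc.cluster.local:4000/xosapi/v1/core/networks
    # ...versus networks that actually exist in Neutron.
    openstack network list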

Thanks & Regards,
Manoj


#8

Hi All,
I have been working on getting MCORD to run on Arm without OpenStack. To do this, I am starting by bringing up the Helm charts for every non-OpenStack component, then checking for functionality and figuring out what holes remain. While doing this I noticed some strange errors coming from the tosca-loader component. The error differs depending on whether I bring up the Helm charts from the tip of master or from CORD 6.0 (the latest stable release), but in both cases it complains about the YAML configuration file it is passed. When bringing up the charts for CORD 6.0, the error is the following: Model of class Network and properties {'name': 'management'} has property 'must-exist' but cannot be found.

These errors are tricky to catch because Kubernetes eventually gives up trying to get tosca-loader to finish successfully and terminates it, which results in a state that looks identical to a successful finish. @Manojawa, have you observed something similar in your deployments?

Also, running without OpenStack on the tip of master (as opposed to CORD 6.0) gets me a “visible” UI without all the errors we saw on x86. Nevertheless, I still saw some bad-configuration-file errors coming from tosca-loader.
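
For reference, a rough way to catch these errors before the pods disappear (exact flags depend on the kubectl version in use):

    # Follow the tosca-loader job's logs live while the charts come up...
    kubectl logs -f job/mcord-tosca-loader
    # ...or list its pods afterwards (older kubectl needs -a/--show-all to
    # include terminated pods) and pull their logs.
    kubectl get pods -a | grep tosca-loader
    kubectl logs <tosca-loader-pod-name>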


#9

Actually, I may have lied. The error near the tip of master (“near” because I am not actually at the current tip) is the same as what I saw on CORD 6.0. Looking at the error more closely, it may not be a problem with the configuration; it could also mean that a service is missing (e.g. something OpenStack-related). I would be interested in checking out @Manojawa’s tosca-loader logs since he is running OpenStack.


#10

@Jimmy - I have sent you an email with the logs of the tosca-loader container and also a YAML file to start the job manually.

FYI to others in the group - tosca-loader is a Kubernetes job. When the job is created (during MCORD installation), it creates a container, runs the loader, and then the container terminates. During one installation the tosca-loader job failed for me as well. I do not remember the exact error, but I restarted the job and it was successful.

Best Regards,
Manoj


#11

Thanks for that explanation @Manojawa!


#12

So the error has been traced back to a dependency on some models loaded by a similar job that is part of the OpenStack-related container set. After loading that container set, the tosca-loader job succeeded. Of course, this will present a new set of problems since I do not have OpenStack running, but we will get there soon.
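
For anyone retracing this, the models in question come in with the base-openstack profile chart. Assuming the chart layout in cord/helm-charts matches the install guide, loading that container set looks roughly like the following (the exact values and overrides will differ, especially when no real OpenStack is behind it):

    # From the helm-charts checkout (chart path and release name as in the CORD install guide).
    cd ~/cord/helm-charts
    helm dep update xos-profiles/base-openstack
    helm install -n base-openstack xos-profiles/base-openstack
    kubectl get jobs | grep tosca-loader    # base-openstack-tosca-loader should complete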