Why does PyMechanical in remote mode give a solver error on an HPC (Linux CentOS 7)?
I have had this error for over a month now and couldn't find a solution on the PyAnsys GitHub repo.
Whenever I run PyMechanical in remote mode, the solver fails.
When the script reaches the line that solves the analysis, it produces an error.
This is a simple case from the PyMechanical documentation examples:
Kindly follow the link to see the PyMechanical code I ran.
I copied all the code into a Python script and ran it on an HPC.
I had already installed ansys-mechanical-core in my virtual environment.
I noticed that the script runs without error when I remove this line:
Solve static analysis.
STAT_STRUC.Solution.Solve(True)
Result without the solver line:
(Pyansys) [HPC ~]$ python /users/Projects/Cantilever-Project/HPC_Test.py
Downloaded the geometry file to: /users/.local/share/ansys_mechanical_core/examples/example_01_geometry.agdb
Initialize() started
Initialize() done
Ansys Mechanical [Ansys Mechanical Enterprise]
Product Version:232
Software build date: 05/30/2023 05:40:47
project directory = /users/.ansys/AnsysMech4B92/Project_Mech_Files/
Uploading example_01_geometry.agdb to 127.0.0.1:10000:/users/.ansys/AnsysMech4B92/Project_Mech_Files/.: 100%|███████████████████████████████████████████████████| 17.0k/17.0k [00:00<00:00, 3.92MB/s]
part_file_path on server: /users/.ansys/AnsysMech4B92/Project_Mech_Files/example_01_geometry.agdb
exit code: 0
AnsMeshingServer, compiled May 30 2023 07:02:06, DS Mesher, Mon Apr 29 10:48:39 2024
AnsMeshingServer, okay, Mon Apr 29 10:48:40 2024
exit code: 255
AnsMeshingServer, compiled May 30 2023 07:02:06, DS Mesher, Mon Apr 29 10:48:40 2024
AnsMeshingServer, okay, Mon Apr 29 10:48:41 2024
exit code: 255
AnsMeshingServer, compiled May 30 2023 07:02:06, DS Mesher, Mon Apr 29 10:48:42 2024
AnsMeshingServer, okay, Mon Apr 29 10:48:42 2024
exit code: 255
AnsMeshingServer, compiled May 30 2023 07:02:06, DS Mesher, Mon Apr 29 10:48:43 2024
AnsMeshingServer, okay, Mon Apr 29 10:48:44 2024
exit code: 255
{"Maximum": "0 [m]", "Minimum": "0 [m]", "Average": "0 [m]"}
The code runs just fine without the solve step, although it produces no results, showing that the gRPC connection is okay.
But with this line, the solver generates this error:
Error with the solver line:
(Pyansys) [HPC ~]$ python /user/Projects/Cantilever-Project/HPC_Test.py
Downloaded the geometry file to: /users/.local/share/ansys_mechanical_core/examples/example_01_geometry.agdb
Initialize() started
Initialize() done
Ansys Mechanical [Ansys Mechanical Enterprise]
Product Version:232
Software build date: 05/30/2023 05:40:47
project directory = /users/.ansys/AnsysMechD1B5/Project_Mech_Files/
Uploading example_01_geometry.agdb to 127.0.0.1:10000:/users/.ansys/AnsysMechD1B5/Project_Mech_Files/.: 100%|███████████████████████████████████████████████████| 17.0k/17.0k [00:00<00:00, 4.06MB/s]
part_file_path on server: /users/.ansys/AnsysMechD1B5/Project_Mech_Files/example_01_geometry.agdb
exit code: 0
AnsMeshingServer, compiled May 30 2023 07:02:06, DS Mesher, Mon Apr 29 10:51:18 2024
AnsMeshingServer, okay, Mon Apr 29 10:51:19 2024
exit code: 255
AnsMeshingServer, compiled May 30 2023 07:02:06, DS Mesher, Mon Apr 29 10:51:19 2024
AnsMeshingServer, okay, Mon Apr 29 10:51:20 2024
exit code: 255
AnsMeshingServer, compiled May 30 2023 07:02:06, DS Mesher, Mon Apr 29 10:51:21 2024
AnsMeshingServer, okay, Mon Apr 29 10:51:21 2024
exit code: 255
AnsMeshingServer, compiled May 30 2023 07:02:06, DS Mesher, Mon Apr 29 10:51:22 2024
AnsMeshingServer, okay, Mon Apr 29 10:51:23 2024
exit code: 255
Running Solver : /opt/apps/testapps/el7/software/staging/ANSYS/2023R2/v232/aisol/../ansys/bin/ansys232 -b nolist -s noread -i dummy.dat -o solve.out -dis -np 50 -p ansys
/opt/apps/testapps/el7/software/staging/ANSYS/2023R2/v232/aisol/.workbench: line 271: 13947 Killed $Pgm $Args
CRITICAL - - logging - handle_exception - Uncaught exception
Traceback (most recent call last):
File "/users/Projects/Cantilever-Project/HPC_Test.py", line 29, in
output = mechanical.run_python_script(
File "/users/.conda/envs/Pyansys/lib/python3.10/site-packages/ansys/mechanical/core/mechanical.py", line 981, in run_python_script
result_as_string = self.call_run_python_script(
File "/users/.conda/envs/Pyansys/lib/python3.10/site-packages/ansys/mechanical/core/mechanical.py", line 1702, in __call_run_python_script
for runscript_response in self._stub.RunPythonScript(request):
File "/users/.conda/envs/Pyansys/lib/python3.10/site-packages/grpc/_channel.py", line 542, in __next
return self._next()
File "/users/.conda/envs/Pyansys/lib/python3.10/site-packages/grpc/_channel.py", line 968, in _next
raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"
debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Socket closed", grpc_status:14, created_time:"2024-04-29T10:52:12.118576839+01:00"}"
>
(Pyansys) [HPC ~]$ (Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
*-----------------------------------------------------------------------------*
| |
| RUN COMPLETED |
| |
|-----------------------------------------------------------------------------|
| |
| Ansys MAPDL 2023 R2 Build 23.2 UP20230530 LINUX x64 |
| |
|-----------------------------------------------------------------------------|
| |
| Database Requested(-db) 1 MB Scratch Memory Requested 0 MB |
| Maximum Database Used 1 MB Maximum Scratch Memory Used 1 MB |
| |
|-----------------------------------------------------------------------------|
| |
| CP Time (sec) = 0.189 Time = 10:52:42 |
| Elapsed Time (sec) = 26.000 Date = 04/29/2024 |
| |
*-----------------------------------------------------------------------------*
slurmstepd: error: Detected 1624 oom_kill events in StepId=2511326.5. Some of the step tasks have been OOM Killed.
I_MPI_JOB_TIMEOUT = -1 second(s): job ending due to startup timeout
srun: error: node002: task 0: Out Of Memory
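The last three lines point at memory rather than at gRPC: Slurm reports that tasks in the job step were OOM-killed, which would also explain the "Killed" message from the solver launcher and the closed socket. A minimal sketch for pulling those events out of a saved log is shown below. The helper name `find_oom_events` and the regular expression are illustrative assumptions, not part of Slurm or PyMechanical; the sample text is copied from the output above.

```python
import re

# Log lines copied from the job output above.
LOG = """\
slurmstepd: error: Detected 1624 oom_kill events in StepId=2511326.5. Some of the step tasks have been OOM Killed.
I_MPI_JOB_TIMEOUT = -1 second(s): job ending due to startup timeout
srun: error: node002: task 0: Out Of Memory
"""

# Matches the slurmstepd message and captures the event count and the step ID.
OOM_PATTERN = re.compile(r"Detected (\d+) oom_kill events? in StepId=(\d+(?:\.\w+)?)")


def find_oom_events(log_text):
    """Return a list of (event_count, step_id) tuples found in log_text."""
    return [(int(count), step) for count, step in OOM_PATTERN.findall(log_text)]


print(find_oom_events(LOG))
```

If the list is non-empty, the step ran out of the memory it was allowed to use, so the memory request of the job is worth checking before looking at the Python side.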
Answers
-
Can you please verify if you are able to run this job manually on this HPC? You seem to be using 50 cores.
0 -
Thank you, @Rajesh Meena.
I have not tried to run this job manually, but I have tried running jobs both manually and with PyMechanical and still got exactly the same error.
How can you tell that I am using 50 cores?
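The core count is visible in the "Running Solver" line of the log: MAPDL takes the number of processes from the `-np` option, and `-dis` requests a distributed solve. As a rough sketch, the value can be read from that command line as follows. The function name `requested_cores` is hypothetical; the command string is copied from the log above.

```python
import shlex

# Solver command copied from the "Running Solver" line in the log.
SOLVER_COMMAND = (
    "/opt/apps/testapps/el7/software/staging/ANSYS/2023R2/v232/aisol/../ansys/bin/ansys232 "
    "-b nolist -s noread -i dummy.dat -o solve.out -dis -np 50 -p ansys"
)


def requested_cores(command):
    """Return the integer that follows -np, or None if the flag is absent."""
    tokens = shlex.split(command)
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-np":
            return int(value)
    return None


print(requested_cores(SOLVER_COMMAND))        # 50
print("-dis" in shlex.split(SOLVER_COMMAND))  # True
```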
0 -
@CollinsJnr Are you saying that you are not able to run any job on HPC? Is this issue specific to PyMechanical?
0 -
I am able to submit jobs (input files from the Mechanical GUI) on the HPC, and they run smoothly.
But when I submit the job through PyMechanical, it gives this error.
0 -
This seems to be the solve command being triggered:
/opt/apps/testapps/el7/software/staging/ANSYS/2023R2/v232/aisol/../ansys/bin/ansys232 -b nolist -s noread -i dummy.dat -o solve.out -dis -np 50 -p ansys
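It may be worth comparing the 50 processes in this command with what the Slurm job was actually given, since every distributed MAPDL process needs its own memory. The sketch below is only an approximation under stated assumptions: `SLURM_NTASKS` and `SLURM_CPUS_PER_TASK` are standard Slurm environment variables, but either can be unset depending on how the job was submitted, and multiplying them is a simplification. The helper names are hypothetical.

```python
import os


def slurm_cpu_allocation(env=None):
    """Best-effort CPU count of the current Slurm allocation, or None outside Slurm."""
    env = os.environ if env is None else env
    ntasks = env.get("SLURM_NTASKS")
    cpus_per_task = env.get("SLURM_CPUS_PER_TASK")
    if ntasks is None and cpus_per_task is None:
        return None
    # Treat a missing variable as 1 (assumption made for this sketch).
    return int(ntasks or 1) * int(cpus_per_task or 1)


def check_core_request(requested, env=None):
    """Describe how the solver's core request compares with the allocation."""
    allocated = slurm_cpu_allocation(env)
    if allocated is None:
        return "no Slurm allocation detected"
    if requested > allocated:
        return f"solver asks for {requested} cores but only {allocated} are allocated"
    return f"{requested} requested cores fit in the {allocated} allocated"


# 50 is the value passed to -np in the command above.
print(check_core_request(50))
```

Run inside the job script, this reports whether the solver is asking for more cores than the job holds. A mismatch would not prove the cause of the out-of-memory kill, but it narrows down where to look.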
0 -
@Rajesh Meena
Yes, it is. But after that,
CRITICAL - - logging - handle_exception - Uncaught exception
Traceback (most recent call last):
File "/users/Projects/Cantilever-Project/HPC_Test.py", line 29, in
output = mechanical.run_python_script(
File "/users/.conda/envs/Pyansys/lib/python3.10/site-packages/ansys/mechanical/core/mechanical.py", line 981, in run_python_script
result_as_string = self.call_run_python_script(
File "/users/.conda/envs/Pyansys/lib/python3.10/site-packages/ansys/mechanical/core/mechanical.py", line 1702, in __call_run_python_script
for runscript_response in self._stub.RunPythonScript(request):
File "/users/.conda/envs/Pyansys/lib/python3.10/site-packages/grpc/_channel.py", line 542, in __next
return self._next()
File "/users/.conda/envs/Pyansys/lib/python3.10/site-packages/grpc/_channel.py", line 968, in _next
raise self
And the gRPC connection closes:
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"
debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Socket closed", grpc_status:14, created_time:"2024-04-29T10:52:12.118576839+01:00"}"
>
0 -
Any idea what the problem might be?
0 -
Any ideas as to what the problem could be?
Any help would be highly appreciated.
0 -
@CollinsJnr It seems you have already reported this on GitHub (https://github.com/ansys/pymechanical/issues/694), where the issue is marked as closed. Can you please clarify?
0 -
@Pernelle Marone-Hitz
Thank you. The conclusion on GitHub was that it is not a PyMechanical problem, so I had to close the issue and post it here.
0 -
Thanks for the info, @CollinsJnr. My understanding from the GitHub issue is that the problem is not related to code at all, since it also happens when you run from the UI. This forum is dedicated to scripting questions, so you should not have been routed here.
If you are a paying customer, I would advise you to contact your local support provider and simply mention the issue that you are having through the UI. If you do not have access to official support, you can create a post in this other forum, which takes care of all other aspects except scripting: https://forum.ansys.com/
1 -
Thank you, @Pernelle Marone-Hitz
So I loaded the Ansys GUI on the HPC and noticed that the material is not showing.
Kindly see the picture below.
Meanwhile, I can see the materials in the Engineering Data Sources.
Is there a material loading setting to see all available materials in Mechanical?
0 -
@CollinsJnr We can only help with scripting questions on this forum. This is likely an installation issue. Please contact your local support provider, or create a post in this other forum, which takes care of all other aspects except scripting: https://forum.ansys.com/
0 -
Thank you so much.
Super grateful for your continued help and support.