Deadline and Network Gurus - 3dsmax2012 crashing on the render farm when rendering out large FumeFX intensive.
posted by Adam Kane  on Aug. 9, 2011, 3:05 p.m. (8 years, 2 months, 10 days ago)
12 Responses     0 Plus One's     0 Comments  
Have you tried hitting up the forums over at Thinkbox Software? The developers are really good at helping troubleshoot any issues with Deadline.
http://forums.thinkboxsoftware.com/

On Tue, Aug 9, 2011 at 11:58 AM, Dax Grove <dw.grove@gmail.com> wrote:
I'll get straight to the point.
3dsmax 2012 SP1 with both hotfixes on workstation and on farm
Deadline 5
FumeFX 2.1 and FumeFX 2.1 SL on farm

I am trying to render a large 3dsmax scene file. When I submit it to the farm, the render goes fine until it begins to load the FumeFX atmospherics, which are about 800 MB each. Then 3dsmax crashes and that frame fails. Now here's where the confusing part comes in. When I resubmit the job to just one machine, it renders out okay, but when it's spread across the 20 render machines it errors about 500 times. Also, I resubmitted the job without atmospherics enabled and the render goes fine. I have tried disabling the Deadline setting that closes 3dsmax between frames; that didn't help.

I cannot determine where the problem is, and rendering such a large file on just one machine isn't an option I want to proceed with because it takes all day.
Our network is built to handle this kind of workload. We have an Isilon cluster that holds all of the referenced files, and all the machines are networked through 2 ProCurve switches with 9k jumbo frames enabled. I had a local company, GPL, help me set everything up, and so far everything has been very fast without any downtime.
Maybe it's a problem with Deadline? A few months ago I had another 3dsmax plugin problem that was resolved with a repository amendment.
If you have any questions or suggestions, please feel free to chime in. My guess is that Deadline 5 has a bug in this particular situation. Or simply, something might be wrong in the network such that when all the machines try to reference this 800 MB FumeFX file, 3dsmax crashes.
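
As a rough sanity check on the network theory, here is a back-of-the-envelope sketch in Python. It assumes all 20 machines pull one 800 MB cache file at the same moment through a single shared link to the storage. The link speeds (1 and 10 Gbit/s) are assumptions for illustration; the post does not state them, and protocol overhead is ignored.

```python
def transfer_seconds(file_mb, machines, link_gbit_per_s):
    """Ideal time for `machines` nodes to each pull one `file_mb` file
    through a single shared link (no protocol overhead, no caching)."""
    # 1 Gbit/s is 125 MB/s when using decimal megabytes.
    link_mb_per_s = link_gbit_per_s * 1000 / 8
    return file_mb * machines / link_mb_per_s


# Assumed link speeds; substitute the real uplink between switch and storage.
for gbit in (1, 10):
    seconds = transfer_seconds(800, 20, gbit)
    print(f"{gbit} Gbit/s shared link: {seconds:.1f} s per frame load")
```

If the number for the real uplink comes out near or above any load timeout in the render pipeline, the storage path is worth a closer look before blaming the render manager.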


Thread Tags:
  discuss-at-studiosysadmins 

Response from ElectricJesus @ Aug. 10, 2011, 3:30 p.m.
I am usually resubmitting as a render user. But I've resubmitted from my roaming profile account and it's still the same error.
I have just determined that the problem is with Afterburn, and seeing as there is only one new version for 2012 I don't have very many options.
http://forum.cgpersia.com/f32/afterburn-render-error-17024/
This guy seems to be having the same issue, but his problem is in 2010. Maybe it's a bug that carried over?
If this really keeps up I am going to have to render it without Afterburn, which is everything, then render Afterburn by itself and comp it in.

On Wed, Aug 10, 2011 at 12:07 PM, <bill@yuco.com> wrote:

Okay, how about this..

When you submit through Deadline, are you the same user as when you interactively render through 3DS MAX?



Response from ElectricJesus @ Aug. 10, 2011, 3 p.m.
Hrm. What do you mean? I submitted the job through 3dsmax. Then I have been versioning up or resubmitting the failed job through the Deadline Monitor.

On Wed, Aug 10, 2011 at 10:56 AM, <bill@yuco.com> wrote:

>From: Dax Grove [mailto:dw.grove@gmail.com]
>Subject: Re: [SSA-Discuss] Deadline and Network Gurus - 3dsmax2012 crashing on the render farm when rendering out large FumeFX intensive.


>I can render the job on just one machine and it works.

Are you doing this render under the deadline account or as a different user?





Response from ElectricJesus @ Aug. 10, 2011, 1:50 p.m.
http://forums.thinkboxsoftware.com/viewtopic.php?f=11&t=5977
Nothing really so far.
I can render without atmospherics just fine. I can render the job on just one machine and it works. I can create a new 3dsmax project, throw in all the plugins I'm using, and send it to the farm and it renders out great.
It doesn't happen in a particular frame range, so it's hard to tell what is going on when it renders out.


Response from Derrick MacPherson @ Aug. 10, 2011, 1:45 p.m.
What do the Deadline guys say? Support is usually awesome.


Response from ElectricJesus @ Aug. 10, 2011, 1:30 p.m.
Okay, so the problem is back.
http://pastebin.com/XwxmeHnH
That's the link to the error I am getting. I am kind of lost for ideas now, considering it was working when I left last night and now this morning it is acting up.


Response from Joseph Boswell @ Aug. 9, 2011, 6:40 p.m.
Sounds like a lot of data passing over the network is all. Could be a Deadline timeout maybe? Not sure what the defaults are, but IIRC there is a scene load timeout.
When you say you rendered it on one machine, do you mean you just whitelisted one machine in Deadline, or you rendered it locally? You might try whitelisting just 1/2 the farm, then 1/4, till you get down to a number of machines that will indeed render. Assuming your render nodes are on a 1GbE switch, is it a 1GbE trunk to the switch with your Isilons? Also, how many Isilon nodes are you actually running, and are you running SmartConnect or whatever it is (the load balancer)? Are your render nodes mapping to just a single Isilon node? What do the graphs on your Isilon look like in the web GUI? Or for that matter, if your switches are managed with interfaces, what do those look like? Have you looked at the Task Manager on the render nodes to see what the network/CPU is doing?
I run FumeFX stuff through the farm all day long here, no issues at all. No idea what size the caches are that the artists are working with, though. Not too sure if they are sending much 2012 stuff off yet, but certainly 2011.
Joe
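
The halving approach above (whitelist 1/2 the farm, then 1/4, and so on) can be sketched in Python as below. `render_succeeds` is a hypothetical callback standing in for "submit a test job to this many whitelisted machines and report whether it finished cleanly"; it is not a Deadline API. Note this is a plain halving loop, not a full binary search, so it returns the first halved pool size that works rather than the exact maximum.

```python
def largest_working_pool(total_machines, render_succeeds):
    """Halve the whitelist until a test render succeeds.

    Returns the first pool size that rendered cleanly, or 0 if even a
    single machine fails.
    """
    size = total_machines
    while size >= 1:
        if render_succeeds(size):
            return size
        size //= 2  # 20 -> 10 -> 5 -> 2 -> 1
    return 0


if __name__ == "__main__":
    # Stand-in for real test renders: pretend more than 6 concurrent
    # machines overload something shared.
    print(largest_working_pool(20, lambda n: n <= 6))
```

If the pool size that works is well below the full farm, that points at a shared resource (storage, network, licensing) rather than at the scene itself.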

Response from ElectricJesus @ Aug. 9, 2011, 6:40 p.m.
I figured it out.
It was a permissions issue. Apparently, the /render roaming profile accounts didn't have write privileges to the sims drive folders. Now that I've changed that, everything is going smoothly. What I cannot figure out is why it worked when I did one machine at a time.
Thanks for the input guys.
Cheers,
Dax
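
For anyone chasing a similar permissions problem, a minimal Python sketch for checking write access from a render node follows. It would need to be run as the same account the render service uses. The share path is a hypothetical placeholder, not the one from this thread. It creates and removes a real temporary file, since `os.access()` can be misleading on Windows network shares.

```python
import tempfile


def can_write(directory):
    """Return True if the current user can create a file in `directory`."""
    try:
        # The temporary file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        # Covers permission errors as well as missing or unreachable paths.
        return False


if __name__ == "__main__":
    # Hypothetical path; substitute the folder that holds the sim caches.
    sim_dir = r"\\storage\sims"
    print(sim_dir, "is writable" if can_write(sim_dir) else "is NOT writable")
```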

Response from ElectricJesus @ Aug. 9, 2011, 6:15 p.m.
Thanks.

Moving the sim data off the Isilon cluster didn't help.

Response from Mike Owen @ Aug. 9, 2011, 5:15 p.m.
Hi Dax,
Please see the multiple replies to your Thinkbox support forum post here:
http://forums.thinkboxsoftware.com/viewtopic.php?f=11&t=5977
Regards,
Mike


Response from Todd Smith @ Aug. 9, 2011, 4 p.m.
We don't use 3DSMax or FumeFX but ..
Quite honestly it sounds like a deterministic problem.  In most particle systems you cannot determine what will happen in frame X without knowing what is happening in frame X-1.  I'm just not clear on whether the atmospherics in the scene are being determined at run time or whether they are cached already.
Just a thought.
Cheers,
Todd Smith
Head of Information Technology
soho vfx | 99 Atlantic Ave. Suite 303, Toronto, Ontario M6K 3J8
office: (416) 516-7863 fax: (416) 516-9682 web: sohovfx.com
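
To illustrate the distinction raised above, here is a toy Python sketch (not FumeFX code, and the update rule is made up): a simulation has to be stepped in order because frame X is computed from frame X-1, whereas once the results are cached, any frame can be read independently by any machine.

```python
def simulate(initial_state, steps):
    """Toy simulation: every frame is derived from the previous one."""
    frames = [initial_state]
    for _ in range(steps):
        # Frame X needs frame X-1, so this loop cannot be split across machines.
        frames.append(frames[-1] * 0.5 + 1.0)
    return frames


def render_frame(cache, frame):
    """With a cache in place, a frame is just a lookup; order no longer matters."""
    return cache[frame]


# The cache is built once, in order (on disk in a real pipeline).
cache = simulate(0.0, 4)

# Render nodes can then pick up frames in any order.
for frame in (3, 1, 4):
    print(frame, render_frame(cache, frame))
```

If the atmospherics are already cached to disk, the frame-order dependency should not by itself explain failures on a distributed render.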


Response from ElectricJesus @ Aug. 9, 2011, 3:50 p.m.
UPDATE: So I duplicated the job 7 times and assigned each task to a separate machine on a different frame range. After submitting, the machines once again errored out, which leads me to believe it has something to do with accessing that FumeFX file.
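
For reference, splitting a job into per-machine frame ranges like this can be sketched in a few lines of Python. The frame range 1-100 is made up for illustration; only the count of 7 comes from the post.

```python
def split_frames(first, last, parts):
    """Split the inclusive range first..last into up to `parts` contiguous chunks."""
    total = last - first + 1
    base, extra = divmod(total, parts)
    chunks = []
    start = first
    for i in range(parts):
        # The first `extra` chunks take one additional frame each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more parts than frames
        chunks.append((start, start + length - 1))
        start += length
    return chunks


# Hypothetical 100-frame job split across 7 machines.
for machine, (start, end) in enumerate(split_frames(1, 100, 7), start=1):
    print(f"machine {machine}: frames {start}-{end}")
```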


Response from ElectricJesus @ Aug. 9, 2011, 3:10 p.m.
I just registered and posted. Hopefully I'll get something.
