selecting cases???????

Hi all,

I have a dataset with the variables year, unit, and products. I would like to keep the cases that are consecutive and delete the cases that are not. For instance, the variable year contains only the values 2003, 2004, and 2005. I would like the result to be:

before   after
year     year
2003     2003
2004     2004
2005     2005
2004     2003
2005     2004
2003     2005
2004
2005

Thank you,
Samuel

=====================
To manage your subscription to SPSSX-L, send a message to LISTSERV@... (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
|
|
Re: selecting cases???????

At 11:22 AM 6/26/2008, Samuel Solomon wrote:
>I have a data with variables year ,unit ,products. I would like to
>select cases which consecutive and delete cases which are not. for
>instance , the variable year is populated by cases ,2003,2004 and
>2005 only . I wish the result to be :
>before after
>
>year year
>2003 2003
>2004 2004
>2005 2005
>2004 2003
>2005 2004
>2003 2005
>2004
>2005

I think you've had no answer because your question is confusing. It looks like the two columns are separate datasets, 'before' and 'after'. That probably confused a lot of people (it confused me), because parallel columns are almost invariably different *variables* in the *same* dataset.

Let's see if I now understand what you want. I've added a variable, 'ID', to identify the cases uniquely, as the date ('year') does not. That's another respect in which your question is pretty confusing.

Starting data

ID year
A  2003
B  2004
C  2005
D  2004
E  2005
F  2003
G  2004
H  2005

You say you "would like to select cases which consecutive and delete cases which are not." So, you want to keep A, B and C, not D or E, and keep F, G, and H? Like this?

ID year
A  2003
B  2004
C  2005
F  2003
G  2004
H  2005

If that's not right, please clarify; if it is, say so, and maybe somebody can help you. Please respond to the list.
|
|
Re: selecting cases???????

That is exactly what I was trying to say. Sorry for not being articulate in my question. By 'before' I meant the starting data, and by 'after' the resulting data. Is there a way?

Thank you,
Samuel
|
|
Re: selecting cases???????

At 11:22 AM 6/26/2008, Samuel Solomon wrote:
>I would like to select cases which consecutive and delete
>cases which are not.

I'd asked, does that mean that if you start with

>ID year
> A 2003
> B 2004
> C 2005
> D 2004
> E 2005
> F 2003
> G 2004
> H 2005

you want,

>ID year
> A 2003
> B 2004
> C 2005
> F 2003
> G 2004
> H 2005

At 01:56 AM 6/30/2008, Samuel Solomon wrote:
>That is exactly what I was trying to say.

Good. Now, what is your rule for this selection? It *looks* like when you have a record for 2003, then you keep it; and you keep the next record if it's for 2004, and the next if it's for 2005, etc. But if a record isn't for the consecutive year after its predecessor, it's dropped unless it's for 2003; and then, all later records are dropped until you get one for 2003.

Well?
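The rule as worded above can be written down as a small state machine, which makes it easier to check against the example. This is a minimal Python sketch, not code from the thread: the function name keep_consecutive, the (ID, year) tuple layout, and the default range 2003 to 2005 are assumptions for illustration.

```python
def keep_consecutive(records, first=2003, last=2005):
    """Apply the rule as worded above to (ID, year) pairs, in file order.

    A record for `first` is always kept and starts a run. A later record
    is kept only if it is the year right after the previously kept one.
    Anything else is dropped, and so is everything after it until the
    next record for `first`.
    """
    kept = []
    expected = None  # year that would continue the run; None = waiting for `first`
    for rec_id, year in records:
        if year == first or year == expected:
            kept.append((rec_id, year))
            # After `last` there is nothing to continue with, so wait for `first`.
            expected = year + 1 if year < last else None
        else:
            expected = None
    return kept


data = [("A", 2003), ("B", 2004), ("C", 2005), ("D", 2004),
        ("E", 2005), ("F", 2003), ("G", 2004), ("H", 2005)]

print(" ".join(rec_id for rec_id, _ in keep_consecutive(data)))  # prints: A B C F G H
```

Note that under this wording a 2003 followed directly by a 2005 keeps the 2003 and drops only the 2005; whether that is wanted is exactly the kind of detail the question above is asking for.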
|
|
Re: selecting cases???????

At 07:52 AM 7/1/2008, Richard Ristow wrote:
>Good. Now, what is your rule for this selection? It *looks* like when
>you have a record for 2003, then you keep it; and you keep the next
>record if it's for 2004, and the next if it's for 2005, etc. But if a
>record isn't for the consecutive year after its predecessor, it's
>dropped unless it's for 2003; and then, all later records are dropped
>until you get one for 2003.

There's more than this going on. It looks like you have some implicit variables that should be made explicit. For example, record F does not follow *consecutively* after record C -- unless the consecutivity counter is re-set. Here are some possible implicit rules -- which are true?

* The range of years is 2003 to 2005.
* "Consecutive" for the last year of the range (i.e., 2005) means that the next record must be the first year of the range (i.e., 2003). I.e., the year following 2005 must be 2003, and records for other years following 2005 should be deleted until a record for 2003 is found, which re-sets the cycle.
* Records may not be re-arranged (i.e., re-ordered), but only deleted.

There may be other implicit rules, too. I notice that Richard's example just happens to have the first 3 records in proper order, forming a paradigm, and that records D&E look like a defective triad (lacking the record for 2003) but then lo and behold F, G, and H are consecutive and complete.

I find myself wondering about the source of the original order. In other words, what if the original set of records was

ID year
A  2003
B  2004
C  2005
D  2003
E  2004
F  2003
G  2004
H  2005

Would any records in this sequence need to be deleted? I have the feeling that there is significant information not yet supplied.

Bob Schacht

Robert M. Schacht, Ph.D. <schacht@...>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814
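The alternative dataset above is a useful test because the answer depends on the reading. Read literally, the rule Richard quoted would keep all eight records, since D (2003) restarts a run and E (2004) continues it. Under a stricter reading, in which only complete 2003-2004-2005 triads survive, D and E would be deleted. The following is a minimal Python sketch of that stricter reading; it is an assumption, not something Samuel has confirmed, and the name keep_complete_cycles and the (ID, year) layout are hypothetical.

```python
def keep_complete_cycles(records, cycle=(2003, 2004, 2005)):
    """Keep only records that belong to a complete, in-order cycle.

    `records` is a sequence of (ID, year) pairs in file order. Records are
    never re-ordered, only dropped.
    """
    kept = []
    pending = []  # records of the cycle currently being assembled
    for rec in records:
        year = rec[1]
        if year == cycle[len(pending)]:
            pending.append(rec)      # continues the cycle in progress
        elif year == cycle[0]:
            pending = [rec]          # abandons the partial cycle, starts a new one
        else:
            pending = []             # out of sequence: discard the partial cycle
        if len(pending) == len(cycle):
            kept.extend(pending)     # cycle complete, commit it
            pending = []
    return kept


alt = [("A", 2003), ("B", 2004), ("C", 2005), ("D", 2003),
       ("E", 2004), ("F", 2003), ("G", 2004), ("H", 2005)]

print(" ".join(rec_id for rec_id, _ in keep_complete_cycles(alt)))  # prints: A B C F G H
```

The two readings give different results only when a triad is started but not finished, which is why the question of what to do with records like D and E here needs an explicit answer.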
|
|
Re: selecting cases???????

Now you have understood me perfectly. So what is the solution for that?
|
|
Re: selecting cases???????

All I need is to delete cases 'D' and 'E', so that the sequence (2003, 2004, 2005, 2003, 2004, 2005, and so on) is kept intact. You were wondering whether the range is 2003 to 2005. That is right: there are only 2003, 2004, and 2005. I just need to keep the sequence and delete the records in between that do not follow the order.

Thank you,
Samuel
|
|
data preparation

I'm currently working with a huge dataset. In order to do any analysis I need to make a lot of adjustments. My complete syntax file needs 5 (!) hours to run.

Is there a way to speed it up?

I'm currently using SPSS 15 and a dual-core PC.
|
|
Re: selecting cases???????

Hi Samuel

> All I need to delete is cases 'D' and 'E' and hence the sequence
> (2003,2004,2005,2003,2004,2005,..........it goes like that) is kept
> intact. You were wondering if the range is between 2003 and 2005.
> well that is right. there are only 2003,2004 and 2005. I just need to
> keep the sequence and delete the rest which are in between that does
> follow the order.

I have added some cases to your sample dataset just to be ready to deal with other "out of sequence" data, like 2003 followed by 2005 (see ID "O" & "P"), besides the case you presented (2004 without a 2003 before, like in ID "D" & "E").

* Sample dataset *.
DATA LIST LIST/ID(A1) year(F8).
BEGIN DATA
A 2003
B 2004
C 2005
D 2004
E 2005
F 2003
G 2004
H 2005
I 2003
J 2004
K 2005
L 2003
M 2004
N 2005
O 2003
P 2005
END DATA.

NUMERIC Flag(F8).
COMPUTE Flag=(year=2003).

* This part flags sequences not starting with 2003 *.
DO IF Flag NE 1.
- IF (year=2004) AND (LAG(year,1)=2003) Flag=1.
- IF (year=2005) AND (LAG(year,2)=2003) Flag=1.
END IF.

* This part flags sequences starting with 2003 not followed by 2004 *.
SORT CASES BY ID(D).
IF (year=2003) AND (LAG(year) NE 2004) Flag=0.

* Now we get rid of every flag=0 data *.
EXE. /*don't eliminate it *.
SELECT IF Flag=1.
SORT CASES BY ID(A).
DELETE VARIABLES Flag.
LIST.

HTH,
Marta García-Granero
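For readers who want to trace the flag logic above without running SPSS, here is a minimal Python sketch that mirrors the three steps on the same sample data. It is an illustration, not part of Marta's solution: the function name flag_cases is hypothetical, and the handling of the first and last rows assumes that an IF whose LAG value is missing leaves Flag unchanged.

```python
def flag_cases(years):
    """Mirror the three flagging steps of the syntax above on a list of years."""
    n = len(years)
    # Step 1: COMPUTE Flag=(year=2003).
    flags = [y == 2003 for y in years]
    # Step 2: a 2004 directly after a 2003, or a 2005 two rows after a 2003.
    for i, y in enumerate(years):
        if y == 2004 and i >= 1 and years[i - 1] == 2003:
            flags[i] = True
        if y == 2005 and i >= 2 and years[i - 2] == 2003:
            flags[i] = True
    # Step 3: un-flag a 2003 whose next row is not a 2004. The syntax does this
    # by sorting descending and using LAG; here we simply look one row ahead.
    # A 2003 in the very last row has no next row and stays flagged (assumption).
    for i, y in enumerate(years):
        if y == 2003 and i + 1 < n and years[i + 1] != 2004:
            flags[i] = False
    return flags


ids = "ABCDEFGHIJKLMNOP"
years = [2003, 2004, 2005, 2004, 2005, 2003, 2004, 2005,
         2003, 2004, 2005, 2003, 2004, 2005, 2003, 2005]

kept = "".join(i for i, f in zip(ids, flag_cases(years)) if f)
print(kept)  # prints: ABCFGHIJKLMN
```

One thing the trace makes visible: the 2005 test looks only two rows back and does not check the row in between, so in this sketch a sequence such as 2003, 2005, 2005 keeps the final 2005 on its own. If such patterns can occur in the real data, the condition may need tightening.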
|
|
Re: selecting cases???????

Thanks Marta and everybody. That will do!
|
|
AW: data preparation

Hi Christian,

Did you try making a production job using SPSS's Production Facility? I am not sure whether this procedure is faster, but it at least allows you to continue working in the meantime.

Hope this helps,
Christian
|
|
Re: AW: data preparation

The problem is that I need a prepared dataset to continue.

More or less what I'm doing at the moment is redoing the preparation to get rid of smaller mistakes, typos, etc.

I'm doing all the analysis in Stata (GLLAMM), so I could run my models while SPSS is busy; it's just that I don't have my dataset ready. And every mistake I detect can result in a 5-hour wait.

Christian
|
|
Re: AW: data preparation

You should tell us more details about the data file and the syntax. There are cases which can easily be sped up, and there are cases where the only solution is a better computer plus more specialized software.

General advice:
* Try to split the file into smaller pieces, use only those you need for a given task, and handle the pieces separately (both separate variables and separate groups of cases are possible, depending on the task).
* Use EXECUTE only when necessary.
* Save the raw data in SPSS .sav files if possible.
* Run the time-consuming tasks overnight.
* Try to understand why the syntax is so slow: what is the most time-consuming part of it?

Jan
|
|
Re: AW: data preparation

Production Mode will not speed anything up. However, if you don't need the user interface and can run your syntax through a programmability job, it might make a large difference. While I can't offer any guarantees, we have had reports from users of between 4x and 10x speed improvement when taking this approach.

Converting a large syntax job to run as a program can be very simple.

If you have a block of syntax (let's assume it is in a file called clean.sps in temp), you can run it as a Python program (assuming the plugin is installed) by just doing, from a Python shell (notice the forward slashes),

import spss
spss.Submit("INSERT FILE='c:/temp/clean.sps'")

The one problem with this is that you would just get the output as text streaming back to the console. So you could get this as html output with oms by wrapping your syntax with something like

oms /destination outfile='c:/temp/clean.htm' format = html.
<your syntax file>
omsend.

You could put that syntax in the Submit call above or add the OMS commands to your INSERT file.

You might also want to call

spss.SetOutput("off")

ahead of the Submit to suppress echoing of the text to the console.

If you have SPSS 16, you can generate a true Viewer document (spv format) from OMS instead of the html.

As always, there are lots of programmability references at SPSS Developer Central (www.spss.com/devcentral).

HTH,
Jon Peck
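Putting the pieces of Jon's message together, the whole job might look like the sketch below. It is only an illustration under stated assumptions: the helper names build_job and run_job are hypothetical, the file paths are the example paths from the message, and run_job only works where the SPSS Python plug-in provides the spss module. Only spss.Submit and spss.SetOutput, both named in the message, are called.

```python
def build_job(syntax_file, html_file):
    """Return the SPSS commands for an OMS-wrapped INSERT, one string per command."""
    return [
        "OMS /DESTINATION OUTFILE='%s' FORMAT=HTML." % html_file,
        "INSERT FILE='%s'." % syntax_file,
        "OMSEND.",
    ]


def run_job(syntax_file, html_file):
    """Submit the job to SPSS. Requires the SPSS Python plug-in to be installed."""
    import spss  # imported here so build_job can be inspected without SPSS

    spss.SetOutput("off")  # suppress echoing of text output to the console
    spss.Submit(build_job(syntax_file, html_file))


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works anywhere.
    for command in build_job("c:/temp/clean.sps", "c:/temp/clean.htm"):
        print(command)
```

Keeping the command text in a separate function means the generated syntax can be checked by eye before committing to a run that may take hours.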
|
|
Re: AW: data preparation

Thanks a lot for your answers, I will try Python for sure.

Since my syntax is really large, I split it into smaller parts and use "include" to put it all together. If I run the same syntax as one huge file without include, it gets about 2 to 3 times faster. How is this possible?

Christian