CS 162 - Nachos Operating System

Although it was the toughest project I did (for class, not including research), I really enjoyed doing it. The goal of this project is to build an operating system with virtual memory, a network layer, a terminal, and more.

Lecture Notes

Phases

Project 1
A simulation of many riders and many elevators using threads.

Project 2
We implemented file I/O system calls and a lottery scheduler.

Project 3
Nachos gives each program what looks like infinite virtual memory, and we can run C programs from our Nachos shell.

Project 4
We implemented a TCP-like network transport layer and wrote a chat program on top of it.

Autograder
There is an invincible autograder. I don't know how many times we submitted. All we hoped for is this:

From cs162-ra@pasteur.EECS.Berkeley.EDU Wed May 15 05:00:20 2002
Received: from pulsar.CS.Berkeley.EDU (pulsar.CS.Berkeley.EDU [128.32.42.24])
    by pasteur.EECS.Berkeley.EDU (8.9.3+Sun/8.9.1) with ESMTP id FAA12337;
    Wed, 15 May 2002 05:00:19 -0700 (PDT)
Received: (from cs162-ra@localhost)
    by pulsar.CS.Berkeley.EDU (8.10.2+Sun/8.10.2) id g4FC0KO11629;
    Wed, 15 May 2002 05:00:20 -0700 (PDT)
Date: Wed, 15 May 2002 05:00:20 -0700 (PDT)
From: Grader

Submission
This is a submission script I wrote to keep track of everyone's submissions and autograder results. Here is the log:

cs162-nx's Submission
Date: Tue Apr 23 21:44:33 PDT 2002
Code: proj3-code
Description: terence: first try!
AutoGrader Result: Result

cs162-nx's Submission
Date: Tue Apr 23 22:19:03 PDT 2002
Code: proj3-code
Description: terence: without the true flag
AutoGrader Result: Result

cs162-nx's Submission
Date: Tue Apr 23 22:40:13 PDT 2002
Code: proj3-code
Description: terence: with the lock in rwhelper fix
AutoGrader Result: Result

cs162-nx's Submission
Date: Wed Apr 24 00:15:21 PDT 2002
Code: proj3-code
Description: terence: my mother fucking fix on save tlb, Be ARCH
AutoGrader Result: Result

cs162-nx's Submission
Date: Wed Apr 24 00:42:56 PDT 2002
Code: proj3-code
Description: terence: fix minor error
AutoGrader Result: Result

cs162-nx's Submission
Date: Wed Apr 24 01:11:19 PDT 2002
Code: proj3-code
Description: terence: fix on save to tlb functions
AutoGrader Result: Result

cs162-nx's Submission
Date: Thu Apr 25 15:08:01 PDT 2002
Code: proj3-code
Description: terence: last relieve .. i am going to sleep now
AutoGrader Result: Result

cs162-nx's Submission
Date: Fri Apr 26 11:21:41 PDT 2002
Code: proj3-code
Description: terence: fix bug in usingmem
AutoGrader Result: Result

cs162-nx's Submission
Date: Fri Apr 26 11:36:10 PDT 2002
Code: proj3-code
Description: terence: final submit
AutoGrader Result: Result

cs162-nx's Submission
Date: Wed May 8 01:03:59 PDT 2002
Code: proj4-code
Description: terence: code done, lots to be done
AutoGrader Result: Result

cs162-nx's Submission
Date: Wed May 8 04:33:16 PDT 2002
Code: proj4-code
Description: terence: not infinite loop in checkifstuffexit,
AutoGrader Result: Result

cs162-ez's Submission
Date: Wed May 8 12:01:33 PDT 2002
Code: proj4-code
Description: tried to fix accept - EL
AutoGrader Result: Result

cs162-ez's Submission
Date: Wed May 8 15:54:21 PDT 2002
Code: proj4-code
Description: tried to fix close
AutoGrader Result: Result

cs162-ez's Submission
Date: Wed May 8 22:51:30 PDT 2002
Code: proj4-code
Description: thried to fix close - EL
AutoGrader Result: Result

cs162-ez's Submission
Date: Wed May 8 23:40:09 PDT 2002
Code: proj4-code
Description: wondering why connection fails
AutoGrader Result: Result

cs162-ez's Submission
Date: Thu May 9 10:14:00 PDT 2002
Code: proj4-code
Description: tried to fix read/write - EL
AutoGrader Result: Result

cs162-ez's Submission
Date: Thu May 9 13:59:08 PDT 2002
Code: proj4-code
Description: tried to fix bulk data transfer
AutoGrader Result: Result

cs162-ez's Submission
Date: Thu May 9 15:09:05 PDT 2002
Code: proj4-code
Description: trying to fix multiple connections
AutoGrader Result: Result

cs162-nx's Submission
Date: Thu May 9 19:31:32 PDT 2002
Code: proj4-code
Description: terence: timeout: 1000, 500, 500
AutoGrader Result: Result

cs162-ez's Submission
Date: Fri May 10 05:31:57 PDT 2002
Code: proj4-code
Description: please work!
AutoGrader Result: Result

cs162-ez's Submission
Date: Fri May 10 06:04:32 PDT 2002
Code: proj4-code
Description: changed timeouts to 20000, NetProcess to use NetKernel.ConnectTimeOut
AutoGrader Result: Result

cs162-ez's Submission
Date: Fri May 10 09:59:30 PDT 2002
Code: proj4-code
Description: timeouts = 20000, connecttimeout = 20
AutoGrader Result: Result

cs162-nx's Submission
Date: Sun May 12 04:30:43 PDT 2002
Code: proj4-code
Description: terence: fin timeoutconnection implemented
AutoGrader Result: Result

cs162-nx's Submission
Date: Sun May 12 05:15:10 PDT 2002
Code: proj4-code
Description: terence: connect timeout 20
AutoGrader Result: Result

cs162-nx's Submission
Date: Sun May 12 14:38:00 PDT 2002
Code: proj4-code
Description: terence: 24 tests ALL PASSED without fin implementation
AutoGrader Result: Result

cs162-ez's Submission
Date: Tue May 14 01:23:18 PDT 2002
Code: proj4-code
Description: new NetKernel
AutoGrader Result: Result

cs162-bn's Submission
Date: Tue May 14 22:44:30 PDT 2002
Code: proj4-code
Description: final final submission
AutoGrader Result: Result

Log File
This is how we communicate. Instead of emailing back and forth, we use CVS and this log file. We do everything entirely online.
****************************************************
Terence May 17 12:04 PM
The page has moved. if you still want to edit the file
go to ~terence/cs162/log.html
the code in result.html will not work any more. i deleted them all to
save space.
****************************************************
Terence May 16 2:07 PM
i am going to back up my account now, so i won't forget.
if you care, the nachos backup will be in www.inst.eecs/~terence/cs162
thanks a lot for everything. good luck on your finals
if you want to backup your file do
tar cpvf ~/`whoami`.tar ~
****************************************************
Amy May 16 1:14 AM
i submitted final design doc.
remember to do ur evaluation too.
****************************************************
Amy May 15 3:19 PM
is part 1 design doc finished?
****************************************************
Amy May 14 7:21 AM
if anyone wants to submit plz update ur nachos/test dir first
(Makefile, get.c, chat.c, chatserver.c, movearray.c,
readlinefrombuffer.c, make sure u have the newest of those)
****************************************************
Terence May 13 5:44 PM
let's get the design doc done tonight, people! i have no time to
spend on this anymore. so yea. be online around 9?
I will keep trying to solve that last test, k?
michael: could you do the destroy fd thing? maybe that is related to
test 106.
elaine: could you start writing design doc?
anyways, be online at 9 k?
amy and katherine: would be nice if you could tell us what's going
on..
here is the result of having timeout and different test result
http://www-inst.eecs/~cs162-nx/weird.txt
****************************************************
Terence May 12 5:02 PM
we do need to take care of that because of the timer specified in the
specs. but by doing this we failed two tests. the autograder is loaded
right now. so let's plan to work at 2 am again, k? get some sleep
before we start work
elaine, if you want to join us...
****************************************************
Michael May 12 1:10 PM
Emil responded and said if the Fin or FinAck packet gets lost it's ok.
For our first example, if A and B are established, A closes and the fin
packet is lost, A is closing and B is in established. If B writes, it
won't return -1 and that is correct because B still thinks it is in
the established state and the data was successfully enqueued.
I'm going to work on destroy FD right now.
****************************************************
Terence May 11 6:37 PM
one more thing to do
destroy fd
test fin/finack when there is actually data. and become -1?
michael could you do that?
****************************************************
Michael May 11 7:20 PM
Terence: I was trying to track amy's problem. I think it might be
doSendAck. Does doSendAck really keep sending Acks if there is no
data? It might possibly speed things up, but it isn't in the protocol.
The biggest problem is, after the sendqueue on the serverside is
closed, it sends a FIN. But if doSendAck keeps sending data, you'll
get a violation. I see two possible fixes to this, don't know which
one you'll want. First possible fix is to only sendack when you get
data. So if packets are dropped, you might recover a little bit slower
because only the side sending data can keep resending. Second solution
is more complicated. We can do what we're doing for now, and then turn
dosendack off when an STP is sent/received. There are more little
details in the second one that I haven't completely sorted out
yet. Personally I prefer the first one because it is simpler and
follows the protocol, although we might get slightly better
performance if we try the second one.
Next question: When I clear the sendbuffer. Does this clear
everything? Or will resend data still send Data that has timed out?
****************************************************
Amy May 11 2:45 PM
ok i m having some technical difficulties now...
i was wondering for dataack....it seems like it is sent no matter what
state it is in? what i mean right here is that it seems like it is
sending an DATA_ACK even right after FIN is sent....and this gives me
a violation...
i will work on this later tonight...this is just going way too slow...
****************************************************
Terence May 11 11:27 AM
if you are suspecting there is a bug in vm stuff, change network to
extend user ... i am not really sure. (i do agree that vm is
buggy) could you give the vaddr that causes that? it is also possible
that it is a 'seg fault' in the c program. happened to me before (right?)
****************************************************
Amy May 11 11:02 AM
I just want to check if all the files in cvs up to date...just
checking...cuz need the latest to run against part 2...
i m getting this really weird bug right now...not sure if it is my
implementation for part 2 or not yet...still checking right now:
java.lang.NullPointerException
at nachos.userprog.UserKernel.isEntryValid(UserKernel.java:73)
at nachos.vm.VMProcess.rwHelper(VMProcess.java:143)
at nachos.userprog.UserProcess.readVirtualMemory(UserProcess.java:249)
at nachos.userprog.UserProcess.handleWrite(UserProcess.java:938)
at nachos.userprog.UserProcess.handleSyscall(UserProcess.java:1038)
at nachos.network.NetProcess.handleSyscall(NetProcess.java:92)
at nachos.vm.VMProcess.handleException(VMProcess.java:344)
at nachos.userprog.UserKernel.exceptionHandler(UserKernel.java:231)
at nachos.userprog.UserKernel$1.run(UserKernel.java:178)
at nachos.machine.Processor$MipsException.handle(Processor.java:603)
at nachos.machine.Processor.run(Processor.java:101)
at nachos.userprog.UThread.runProgram(UThread.java:32)
at nachos.userprog.UThread.access$0(UThread.java:28)
at nachos.userprog.UThread$1.run(UThread.java:20)
at nachos.threads.KThread.runThread(KThread.java:160)
at nachos.threads.KThread.access$0(KThread.java:158)
at nachos.threads.KThread$1.run(KThread.java:151)
at nachos.machine.TCB.threadroot(TCB.java:204)
at nachos.machine.TCB.access$0(TCB.java:181)
at nachos.machine.TCB$1.run(TCB.java:71)
at java.lang.Thread.run(Thread.java:484)
****************************************************
Elaine May 11 12:31 AM
Test cases for tests 207,208, 406, 407 in nachos/test.
Could someone make sure the tests are sufficient?
****************************************************
Terence May 10 2:43 PM
i found a bug
so we need to finish up. test thoroughly the cases the autograder doesn't
have. ask if amy needs help
****************************************************
Elaine May 10 10:51 AM
Okay. We now pass 23/24 tests. I have submitted the same
code a couple of times to make sure the results are
consistent. The test we are not passing is "read queued
data after close."
I believe the problem was in lookAndGet in VMProcess,
I didn't return a te if I found a pageFault besides the
first page fault, so please update that too.
There is still an assertion error in VM, but increasing
physical pages to 32 should take care of that, not sure
why that appears.
****************************************************
Elaine May 9 9:30 PM
Sorry. A lot of my fixes are little hacks, so it is not
modular or neat. When I submit and it is trying to tar up the file,
it gives a segmentation fault, I am not sure why.
Sorry for not indenting.
I am also confused by apparently non-deterministic errors.
Emil says we should submit the same code a couple of times
and see if any fail sometimes.
The reason I put TLBMiss lookAndGet is because when we
call openFile.read(), it will sometimes go to another page
and then since the next entry is not in the table, it returns
invalid entry, and returns right away without reading the
rest of the data.
The reason I call handleRead first, is because I want to
return -1 right away if the connection is closed, otherwise
it tries to read some data. I wouldn't do that, but I can't
distinguish files from connections.
****************************************************
Terence May 9 7:10 PM
i noticed that elaine. all your submissions' tarfiles have 0 bytes. is
there anything wrong with your submission?
will be back around 12.. let's do big debug
we passed 21 tests.. sometimes....
let's get this done !!! WE CAN DO IT! i will work on it very seriously!
****************************************************
Terence May 9 4:21 PM
why is cleanUpSemaphore.P() in the cleanupLoop again?
did it work without it? if so, i am very surprised. it passed all the
tests.. maybe that's why it did not pass the reliability test
because the timeout thing doesn't work anymore.
don't forget to clean up the total thing after we are done debugging
elaine: please please please indent your code before you put it in
cvs. it gives a lot of conflicts. there is an indent thing in the
emacs menu
elaine: good job on finding that seqno bug. let me fix it, k?
the method handlerRead.. should it be handleRead?
i fixed seqNo
elaine: why did you put TLBMiss in lookAndGet? it is just not the right
place to put that k? track where that comes from and put it there
instead.
... well, i don't care
anymore since we are not going to build anything on top of anything
anymore. but just a general piece of advice. a method is just supposed
to do what it is supposed to do. i expect you know that from code
complete right?
elaine: but good job on finding that seqno bug though
so look at the stuff i mentioned above.. i didn't really change it
except for the stuff in mailmessage
for your handleread. is it right to do if openfile.read(temp....) == -1
return -1
else super.handleRead? that sounds weird. because super.handleread
will call openfile.read again... really doubt it...
****************************************************
Elaine May 9 2:53 PM
Nevermind, does not work for reliability < 1.0. Also,
there are random assertion errors when running test cases
like netTest15, netTest3, that come every once in a while.
****************************************************
Elaine May 9 2:00 PM
Okay, I think I fixed the seqNo problem, someone please
check, just run selfTestSeq in NetKernel, with some
numbers. Also, it will now run netTest15 (bulk transfer
of 6000 bytes) to completion. I will work on it more
tonight.
****************************************************
Elaine May 9 1:15 PM
I changed VMProcess.lookAndGet b/c there was an
error when you try to read/write from more than one
page. Someone please check to see if it is right.
****************************************************
Elaine May 9 11:40 AM
I don't think that is the main problem. Actually it is
hard to tell right because
On further testing,
I discovered that the seqNo are set wrong. There are sign
errors in setSeqNo and getSeqNo, so that after you go
past 127, the seqNo are screwed up! It starts going negative,
then positive after another 127, so that in receiveData
we continually drop packets 128-256 b/c they are negative
and under the receiveWindow...then once the numbers are positive
again, the packets are beyond the receive windows, so we
drop everything.
Could someone take a look at it? I will try to fix after
discussion.
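A minimal sketch of the sign problem described above, assuming the sequence number travels in a single header byte. The log does not show the real setSeqNo/getSeqNo, so the class and method names here are hypothetical. Java bytes are signed, so reading the byte back without a mask turns 128 through 255 into negative values, which would match packets falling under the receive window.

```java
// Hypothetical helper; not the project's actual setSeqNo/getSeqNo.
final class SeqNoCodec {
    // Store a sequence number (0..255) in one header byte; keeps the low 8 bits.
    static byte encode(int seqNo) {
        return (byte) seqNo;
    }

    // Buggy decode: the byte is sign-extended, so 128..255 become -128..-1.
    static int decodeSigned(byte wire) {
        return wire;
    }

    // Fixed decode: mask with 0xFF to read the byte as an unsigned value.
    static int decodeUnsigned(byte wire) {
        return wire & 0xFF;
    }

    public static void main(String[] args) {
        for (int seqNo : new int[] {0, 127, 128, 200, 255}) {
            byte wire = encode(seqNo);
            System.out.println(seqNo + " -> signed " + decodeSigned(wire)
                    + ", unsigned " + decodeUnsigned(wire));
        }
    }
}
```

With the mask in place, the decoded value stays in 0..255, so window comparisons only have to deal with wraparound at 256 and not with negative numbers.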
****************************************************
Terence May 9 11:30 AM
good! michael. let's talk about it tonight
****************************************************
Elaine May 9 9:05 AM
Processor.pagesize is 1024 bytes.
I will take a look at read/write.
****************************************************
Michael May 9 2:46 AM
How big is Processor.pagesize? The reason I'm asking
is because that's the size of the buffer that's passed
to write each time. I'm assuming it is going to be smaller
than 7000.
See if this makes sense to you guys. If I call write(7000)
handleWrite breaks it into 7000/Processor.pagesize writes.
Then connection.write breaks this down into
Processor.pagesize/packetpayloadsize number of packets. Except,
I really doubt they divide evenly so every single iteration of
Connection.write will have a packet at the end with missing stuff.
This is normally fine because the length is included as the first
byte of the payload.
However, if someone calls read(7000) at the other end, it's going to
expect 7000 seemingly consecutive bytes. But the way the window
shifts/calculates number of packets received, is by dividing
totalbytesrecieved by the number of bytes per packet. Because of these
packets which aren't completely filled, the window slowly shifts less
and less. Anyone see what I mean?
Another possible error is where it calculates current packet index
because it just takes poswhereleftoff and divides by payload size
except not all packets are guaranteed to have a payload of the same size.
I'm actually oversimplifying because I'm looking at doCopy right now
and am not absolutely certain how endofpacket, restofpacket, and
poswhereweleftoff all relate to each other. If elaine or terence could
take a look it would be great.
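A small sketch of the counting mismatch described in this entry. It assumes each write is split into chunks of Processor.pagesize (1024 bytes, per the May 9 9:05 AM entry) and each chunk is packetized on its own. The payload size of 100 bytes and all names below are hypothetical, chosen only to make the arithmetic visible.

```java
// Hypothetical illustration; not the project's Connection.write code.
final class PacketCount {
    // Packets actually produced when every chunk is split separately,
    // so each chunk can end with a partially filled packet.
    static int packetsSent(int totalBytes, int chunkSize, int payloadSize) {
        int packets = 0;
        for (int remaining = totalBytes; remaining > 0; remaining -= chunkSize) {
            int chunk = Math.min(chunkSize, remaining);
            packets += (chunk + payloadSize - 1) / payloadSize; // ceiling division
        }
        return packets;
    }

    // Receiver-side estimate that assumes every packet is completely full.
    static int packetsEstimated(int totalBytes, int payloadSize) {
        return totalBytes / payloadSize;
    }

    public static void main(String[] args) {
        System.out.println("sent      = " + packetsSent(7000, 1024, 100));
        System.out.println("estimated = " + packetsEstimated(7000, 100));
    }
}
```

With these assumed sizes, write(7000) produces 75 packets while the division-based estimate says 70, so a window that advances by the estimate falls further behind with every chunk. Counting packets from the length stored in each packet, instead of dividing total bytes, would avoid the drift.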
****************************************************
Elaine May 9 12:56 AM
Very weird....I fail the first test, then when I submit
I include the swapfile and pass that test. This is so
non-deterministic!
I changed part of handleRead/Write in userprog, so please
update there too.
Still failing bulk transfer...not sure why. Same for
the read after close and read byte at low reliability.
****************************************************
Elaine May 8 8:27 PM
I believe accept is ok now. I am not sure why we are
failing the test that says can't read after the connection
is closed, any ideas?
****************************************************
Terence May 8 6:15 PM
you've got to copy it to your own directory first. of course it is not going
to work if you untar in my directory. in the worst case, copy it to
your computer, and use winzip to open it k?
i don't know what else i can tell you
could you produce a test case that will have that situation: cannot
accept.... blah..
****************************************************
Elaine May 8 3:11 PM
This is what it says when i try tar -xvf
"tar: cannot open
/home/cc/cs162/sp02/class/cs162-nx/public_html/be_arch/.. Permission
denied..."
i will change the order in fd.setClose() in UserProcess to set the
fileDescriptor to false first before calling openFile.close()
It seems to work okay for all cases i run on 1.0 reliability, i
will test with less reliability...i am not sure why we keep
running out of time for the autograder.
I was testing simple connect/accept read/write. It does go pretty
slow for many processes trying to connect to one server.
Actually, can someone run netTest14 (only Network(0)) and see
if they get garbage? Try it a couple of times. I am trying
to close the fileDescriptor and still read from the not closed
end. Tell me if the behavior is correct.
It is like netTest13 (which works correctly) except in netTest13,
I wait until everything is written until I start reading.
****************************************************
Terence May 8 12:40 AM
don't understand why, if you are the one who creates the file (when you
download it), it would say permission denied
again download the file
gunzip file.tar.gz
tar xvf file.tar
elaine, are there any test cases that do not work?
hey guess what? just an idea: when you close, better remove that file
descriptor! so no one could read or write to it
****************************************************
Elaine May 8 12:19 PM
I tried to fix handleAccept....I found that the equals
function for Connection was not right. Could someone check
that I changed it correctly?
Terence: I download the tar file from the webpage, but when
I tried to un-tar? it says permission denied.
Or could you update your latest version...I fail connecting
to local host test, but yours passed.
****************************************************
Terence May 7 7:23 PM
no, that's not what i mean. if you don't sleep, that doesn't mean you need
to busy wait, right? as i said use the timeoutconnection class. you
cannot sleep in the receiver thread!
there are still some bugs in getackbitmap, setackbitmap, someone fix
it please!
****************************************************
Michael May 7 6:08 PM
Errr, close cannot start destroying things until all the messages are
sent. So are you saying instead of having it sleep/alarm, I should
have it busy wait? Sounds a lil inefficient, but just want to make
sure that's what you mean before I do it.
If someone else sees another solution, I'd be glad to hear it.
****************************************************
Terence May 7 5:15 PM
I read what michael wrote about close. it looks good, BUT PLEASE PLEASE
do not sleep or use alarm because this is the receiving thread. if you
sleep, it will not get any messages.
the reason why i have the timeoutconnection module is to do such
things. it 'posts' an event and some thread will pick it up after
a certain period of time
more bugs in mailmessage, bitmap.. it involved in extending
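A rough sketch of the "post an event, let another thread pick it up later" idea from this entry. This is not the project's timeoutconnection class, which the log does not show. The class name, the explicit `now` parameter, and the use of plain Java collections instead of Nachos threads are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical timeout queue: the receiver posts and returns at once,
// a separate timer thread calls runDue() periodically.
final class TimeoutQueue {
    private static final class Event implements Comparable<Event> {
        final long dueTime;
        final Runnable action;

        Event(long dueTime, Runnable action) {
            this.dueTime = dueTime;
            this.action = action;
        }

        @Override
        public int compareTo(Event other) {
            return Long.compare(dueTime, other.dueTime);
        }
    }

    private final PriorityQueue<Event> events = new PriorityQueue<>();

    // Called from the receiving thread: never sleeps, only records the event.
    synchronized void post(long now, long delay, Runnable action) {
        events.add(new Event(now + delay, action));
    }

    // Called from the timer thread: runs every event whose time has come
    // and returns how many were run.
    int runDue(long now) {
        List<Runnable> due = new ArrayList<>();
        synchronized (this) {
            while (!events.isEmpty() && events.peek().dueTime <= now) {
                due.add(events.poll().action);
            }
        }
        for (Runnable action : due) {
            action.run(); // run outside the lock so actions may post again
        }
        return due.size();
    }
}
```

The point of the split is that close can schedule its cleanup with `post` and return, so the receiving thread stays free to handle incoming packets while the wait happens elsewhere.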
****************************************************
Amy May 7 5:05 PM
terence: well i guess it is the same idea...the prob was it didn't
really catch that null pointer bug b/c u r only using 1 link address
(0)...and i think elaine said she fixed it...anyways...will work on it
later tonight
****************************************************
Terence May 6 12:40 PM
I put a lock around the sendData stuff
if i get an ack of seqno 2, remove everything before it
why is cpDataFromMessage (at the end, it has the amount you copied)
what are the cases where we need an offset from the mailmessage
elaine: could you either explain why you need that extra offset at
the end or remove that line. if you have a reasonable explanation,
remove my line in read.
amy: it shouldn't make much difference whether using one nachos w/ 2
processes or two nachos w/ 1 process.
so how is close? I will work on this until four and i have to do
something else..
Please do update before you commit!
why do you need a seqno in a stp?
i rewrote mailmessage decode for dataack. it did not work at all for
the original version
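A minimal sketch of the rule from the top of this entry ("if i get an ack of seqno 2, remove everything before it"), with the lock around the send data. The class is hypothetical, it reads "before" as strictly lower sequence numbers, and it ignores sequence number wraparound, none of which the log confirms.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical send queue holding the sequence numbers still awaiting an ack.
final class SendQueue {
    private final Deque<Integer> unacked = new ArrayDeque<>();

    synchronized void enqueue(int seqNo) {
        unacked.addLast(seqNo);
    }

    // Drop every packet queued before the acked sequence number.
    // Assumes packets were enqueued in increasing order and no wraparound.
    synchronized int onAck(int ackedSeqNo) {
        int removed = 0;
        while (!unacked.isEmpty() && unacked.peekFirst() < ackedSeqNo) {
            unacked.removeFirst();
            removed++;
        }
        return removed;
    }

    synchronized int size() {
        return unacked.size();
    }
}
```

Making the methods synchronized plays the role of the lock: the writer enqueuing packets and the receiver handling acks cannot interleave in the middle of an update.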
****************************************************
Elaine May 6 11:05 AM
Was the null pointer in handleDataAck or handlePacket? If so,
I think I fixed it, it was in Connection class. I notice that
for sendStp, we do not put a seqNo in, I will add that.
****************************************************
Amy May 6 7:57 AM
terence: remember i was saying there's a null pointer, that is b/c i
was actually running two different nachos (so 2 netkernels), and u r
running 1 nachos w/ 2 processes, so u r not really doing remote
connect...(seems like the sender has sent a dataack after it sends data)
****************************************************
Elaine May 6 8:36 PM
I don't see what is wrong with the old handleClose. It
does the same thing as the new one, someone correct me
if I'm wrong. In the new handleClose, freeFD needs to be
updated along with openFileNames array, but that is
already done in original handleClose.
****************************************************
Michael May 6 7:55 PM
Looked at read again, made one small change.
Made some large changes to stop/finish module, I'm not sure
how "safe" these changes are.
First, I wrote a new handleClose in NetProcess.java because I
don't think the old one will work in this case. Then I took
the handleClose in NetKernel and copied most of it into
Connection.close and threw in a few lines to destroy + free port.
Elaine if you could take a look to see what other resources can be
freed it would be great.
****************************************************
Elaine May 6 5:54 PM
Yeah...it should just be currentPacketIndex for doCopy,
and cpData should be amtCopied...sorry, i was confused.
I just notice we don't ever use the offset passed into
read. I will fix that after finish debugging read for
offset of 0.
****************************************************
Terence May 6 3:52 PM
why is it that in docopy
the endofpacket is = max(0, currentPacketIndex - 1)
why is that? why not just currentPacketIndex?
explain....
elaine: why is the offset in cpData posWhereWeLeftOff? shouldn't it be
amtCopied? i put your test cases at the end, because it gave me
compile errors... i'll see you in lecture
michael: the makeData shouldn't set the payload directly like that,
because again data abstraction. all the format stuff should be in
mailmessage.... let me change it for you. I corrected something so could
you read 'write' again?
hey, if write returns -1, will the message get sent or not? just curious
****************************************************
Amy May 6 11:15 AM
oops sorry, i mean it is an instance vars, but shouldn't pass it in
function as a parameter.
****************************************************
Terence May 6 10:49 AM
yup yup yup. the code is very very buggy now because i didn't work on
it yesterday (typing my paper) i am sure there will be inconsistency
in the way we write the code, duplicate functions, false assumptions and
all that stuff....
i agree with amy that poswhereweleftoff should not be global. it should be
an instance variable within connection.
elaine: could you please mentally run through read a couple of times?
also, did you fix handleconnect, handleaccept?
michael and amy: i didn't look at the close module in depth yet,
so.... BUT is it possible to leave the close/fin module to you? or do you
want me to work with you... i just want to make sure datatransmission
works first, then i'll work on the next one
i will work on testing/debugging datatransmission tonight....
oh also, when people say 'fix something' i am sometimes confused about
what they actually fixed (that applies to me too) so yeah it would be
nice to be specific
****************************************************
Michael May 6 9:26 AM
Fixed seqNo again. Yeah, I keep changing it =T. My latest version is
now in network.
Terence: The reason I asked was I want to know is it possible for one
side to do a write(100) and the other side to do a close() just a tiny
bit afterwards and when close check send buffer the send buffer is
empty? We can discuss this after the meeting.
****************************************************
Elaine May 6 9:23 AM
Thanks for the debugging, Amy. I will work on fixing
read now.
****************************************************
Amy May 6 7:05 AM
some things i see when i read (sorry not finish reading yet):
handleSyn didn't take care of the case connection.CLOSED?
minor access thingie: Connection constructor should be private b/c u
don't want anyone to make a new Connection w/o calling
makeNewConnection.
shiftWindow, right here u try to pass in the global variable
posWhereWeLeftOff, i guess that's ok...but not necessary, but the prob
happens at the end, when u try to decrement posWhereWeLeftOff, it is
decrementing the local variable, not the global one
i didn't want to cvs commit anything b/c i m not sure if anyone of u
have modified stuff and the one on cvs may not be the newest version.
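A short sketch of the shiftWindow problem raised in this entry. Java passes ints by value, so decrementing a parameter named posWhereWeLeftOff changes only the local copy and leaves the field untouched. The class and method names are hypothetical; only the variable name comes from the log.

```java
// Hypothetical demo of parameter shadowing; not the project's shiftWindow.
final class WindowDemo {
    int posWhereWeLeftOff = 10;

    // Buggy: the parameter shadows the field, so the field never changes.
    void shiftBuggy(int posWhereWeLeftOff, int amount) {
        posWhereWeLeftOff -= amount; // modifies the local copy only
    }

    // Fixed: no parameter, so the name refers to the instance field.
    void shiftFixed(int amount) {
        posWhereWeLeftOff -= amount;
    }

    public static void main(String[] args) {
        WindowDemo w = new WindowDemo();
        w.shiftBuggy(w.posWhereWeLeftOff, 4);
        System.out.println("after buggy shift: " + w.posWhereWeLeftOff);
        w.shiftFixed(4);
        System.out.println("after fixed shift: " + w.posWhereWeLeftOff);
    }
}
```

Keeping the position as an instance variable and not passing it around at all removes the chance of updating the wrong copy.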
****************************************************
Terence May 6 4:10 AM
all your questions are kinda related to the close/fin module which i did not
pay a lot of attention to yet... to give you a temporary answer...
for your first question
i do think both side need to do close.
for your second question about the sendbuffer
isn't when sendbuffer is empty == all data sent?
i am not talking about the global send buffer.
don't forget the meeting tomorrow at 12:30
****************************************************
Michael May 6 12:30 AM
I think Elaine did a preliminary test of Connect/Accept.
A couple things: Can both sides close? or only the side
that sends? Also, is it sufficient to just check if the sendbuffer
is empty? or is it possible that the send buffer be empty
yet not all data sent yet?
****************************************************
Terence May 5 11:30 PM
Michael: the equals method is for the contains method (of some class, don't
remember) there are a lot of java built-in classes that use the Object
equals method which is not what we want... so.... hashtable,
linkedlist... blah...
close is not the same as handleclose. go to userprocess. you will see
openfile.close is used. basically, it should destroy the connection
(already implemented) and.... free the port (if
possible).. errr... check each invariant. just maintain each
variable's invariant... if you know what i mean...
elaine and micheal: good job!
does anyone try to test it?
i am very sure there are a lot of bugs
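A sketch of why the equals override matters for contains, as explained at the top of this entry. Without it, LinkedList.contains and Hashtable lookups fall back to Object.equals, which compares references. The fields used to identify a connection below are an assumption; the log does not say what the real Connection.equals compares.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the identifying part of a Connection.
final class ConnectionKey {
    final int localPort;
    final int remoteAddress;
    final int remotePort;

    ConnectionKey(int localPort, int remoteAddress, int remotePort) {
        this.localPort = localPort;
        this.remoteAddress = remoteAddress;
        this.remotePort = remotePort;
    }

    // Value equality, so collections can find an equivalent connection.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ConnectionKey)) return false;
        ConnectionKey that = (ConnectionKey) other;
        return localPort == that.localPort
                && remoteAddress == that.remoteAddress
                && remotePort == that.remotePort;
    }

    // Must agree with equals, or hash-based collections will miss matches.
    @Override
    public int hashCode() {
        return Objects.hash(localPort, remoteAddress, remotePort);
    }

    public static void main(String[] args) {
        List<ConnectionKey> open = new LinkedList<>();
        open.add(new ConnectionKey(5, 0, 7));
        System.out.println(open.contains(new ConnectionKey(5, 0, 7)));
    }
}
```

The parameter type has to be Object. An equals(ConnectionKey) overload would compile but would not be the method that contains calls.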
****************************************************
Elaine May 5 6:15 PM
version able to compile in network. Write needs to
take care of seqNo. Fixed read, shiftWindow to
return int.
****************************************************
Michael May 5 3:00 PM
Elaine: Read, Shiftwindow, and DoCopy didn't return anything although
they're supposed to return an int. For now I've temporarily had them
return 0.
Terence: I'm not exactly sure what you're trying to do with your
equals method so I've left it alone for now.
Close isn't done yet, and I'm not sure what it is supposed to do. I
don't think Close equals handleClose or how they're related. If
someone can clarify these for me, I can start working on close.
****************************************************
Elaine May 5 2:14 PM
Yes, all the files are in network. The basic structure
is in place and it needs to be compiled and tested,
and debugged.
****************************************************
Amy May 5 1:34 PM
ok what is going on right now? i m almost done w/ my essay...so i will
be coming back to join u guys/gals soon. sorry about being away for
so long. can anyone tell me what has been done now? all the files in
nachos/network right?
****************************************************
Amy May 4 4:19 PM
kat: if u can, can u code in chatserver.c when u r done w/ chat.c? it
shouldn't be that bad for u since all the pdl are there....
well just try....i will work w/ u after i m done w/ this
essay...hopefully very soon....sorry about that...
****************************************************
Amy May 4 10:09 AM
kat: i write 2 functions for u, movearray.c and readlinefrombuffer.c,
they should be helpful for u, they are in nachos/test also, and
remember to add these 2 files in the LIB in ur Makefile
anything just ask me ok?
also can other people add that in too? thanks!
****************************************************
Amy May 4 8:41 AM
huh??? kat, what do u mean? it is in nachos/test....cvs update in that
directory?
****************************************************
Katherine May 4 6:47am
Amy, did you write the comments in chatserver.c and chat.c?
****************************************************
Terence May 4 6:30 AM
Elaine and i worked on the cleanup and data transmission modules. they are
not done yet. elaine, i've got a question for you about handleaccept: if
something gives you a negative port, will the number of free FDs
be the same after? i am not sure what getFreeFD does. could you make
sure that if connect and accept fail, the number of fds, free ports, and
connections will be the same as before? i don't think it is doing that
right, right now. so...
michael
i think it does not really make sense to wait until saturday. i
expected mailmessage to be done by thursday because everything depends
on it. it is not even possible to compile without that. so are we
going to wait an extra two days doing nothing?
for proj3, part3, i did look at what you had in junk back then. it
was not until amy told me so. it would be better if you wrote in the
log that there is such a thing. also, it was far from
complete. everything is like 'yah, not sure how to do this yet' so
should we wait for you and do nothing? if that was so, we would
for sure not have finished proj3 on time. when something needs to be
done, it needs to be done, otherwise that person is dragging down the rest of
the group. same thing for connection class, i asked you to do
connection and i think there was only a constructor and nothing
else. i am not sure if i gave you enough direction or i expected too
much from you. Also, i could count how many times you posted on the
log. if you have nothing to do, you could at least show concern or ask,
right? instead of coming back after the project is done and asking to
help with the final design doc.
okay, enough of this... you know what to do from now on. people would
just know how much a person contributes without telling everyone that
he/she did something.
back to part 1,
the send, receive, connect, cleanup, connection modules are done. there
is more left. that is closefinish and the datatransmission module, and that
is quite a big part of the project too. so if someone wants to
contribute, this will be a good starting point. it is strongly
recommended to read what is already there, since there are almost 1000
lines there already, and the design doc. if you have questions, ask! it
is better to ask than write something that does not work. me and
elaine are working on datatransmission. we could finish coding that
part by tomorrow.
the files are in junk
there is no need to test mailmessage because it is just way too simple
I will be online at ten tomorrow
****************************************************
Michael May 4 12:36 AM
I've been a major flake, especially on this phase and
the first phase. I know everyone has a heavy load and I don't intend
on screwing anyone. I am indeed ashamed of having you guys do more
work than me. I don't have any excuse such as not reading the logs as
I have been monitoring them quite closely since the second phase, and
I've put all other stuff ahead of this one. I thank Elaine for writing
mailmessage on Thursday while I was gone, and Terence for cleaning it
up. My last point though, I would ask you all read and think a little
before responding. I don't intend to offend anyone, but I believe we
do have a communication issue to work out.
1st, Terence sorry for falling asleep on Wednesday, I believe I was up
until around 2-3 before I nodded off. I know you said lets get 2
things done, but apparently I didn't even do 1. My lack of efficiency
that night was inexcusable.
2nd, This is the touchy one. I believe we may have communication
issues here. The understanding I got from Terence was that MailMessage
needed to be done by Saturday, yet when I looked this morning (Friday)
it looked done. Terence, I have no problem when you say you aren't
going to help because you've given me the easiest part. I actually
appreciate you giving me an easier part as well as trusting me to
complete it. However, I'm at a loss when I log in and find that you
and elaine have completed it. I had every intention of working on it
Friday as I was using Thursday to run errands.
I guess the point I'm trying to make is that I could use a
little help knowing exactly when you want it done. Of course
everything always needs to be done ASAP, but when I see a deadline, I
view it as the amount of time to complete a task. Also, I don't
intend on ever using the group as a crutch. I am indeed still ashamed
of the lack of work I did during phase one and during the group
evaluation that's turned in after, I told the TA the situation and how
the entire group was carrying my weight in the
extra comments. I don't ever intend on repeating that. While I also
have little work to show for phase 3, I did have some code in junk and
I actually spent a lot of hours doing non-productive stuff simply
staring at the code not knowing how I should fill in. This was my
fault completely and I should have asked on the log as to where I
could contribute. As of right now, I don't know what is left. I talked
to Elaine and she said quite a bit and to ask Terence, but he wasn't
on AIM. It's now 1am and I guess I spent 30 minutes writing this, but
I'm going to sleep now. I'll be up at 10am tomorrow and I will start
by testing MailMessage, if anyone has something more constructive for
me to do, I will gladly do that instead. Also, if you do think it's
necessary for me to work now, go ahead and call. I'll get up and
work. My number is on the contact and everyone is welcome to call at
any time. Once again, I am sorry. I don't intend on letting anyone
down this time.
****************************************************
Amy May 3 10:36 PM
agree terence! that's what happen to me in proj1, i was spending time
to debug the elevator the day before 2 mts, which costs me to have 2
bad mt grades (162, 188), this is not funny at all. not to mention about
studying for 170!... hey everyone has a heavy load...if u can't handle
it should have said it earlier..like dropping the class....don't screw
us up! it is not fair for those who spent sooooo much time in the
proj..and get a bad grade b/c no time to study for finalz, while those
who doesn't spend any time in the proj, to get a good proj grade and a
good final grade, b/c they have sooo much time to study for it! i
mean u should feel bad too, think about how u get the grade, by
sacrificing other people's grades! what a shame! and i don't like to
hear any explanation like 'oops, forgot to read the log..i didn't know
i have to do this', this is just sh*t! everything is in the log, if u
dunno ur duty, that's totally ur fault! and not only not deserving ur
names on the project...i think we should talk to the ta or maybe prof
about it. cuz it is totally not fair!
****************************************************
Terence May 3 10:17 PM
To everyone in our group,
Due to the upcoming final exams, we will NOT tolerate any
slacker(s) in this part of the project. If one did not spend
sufficient time on the project or has noticeably poor efficiency, we
will NOT put his/her/their name(s) in the design doc and final
submission of the project, not to mention a zero in the group
evaluation form. This may sound harsh, but this is totally not fair
for other group members who devote most of their time on it. this is
the last phase of the project and there should be NO excuses, like
'forget about the project' or 'have other stuff to do' or other lame
unreasonable excuses.
****************************************************
Amy May 3 12:32 PM
submitted design doc
****************************************************
Terence May 3 10:21AM
could you try to compile your code? turn on aim
I am actually fixing your code.... go on aim
****************************************************
Elaine May 3 10:06 AM
Please ignore what I said before about toBytes. I was
using the wrong method. New version of MailMessage now
available in proj4/junk. Enjoy!
****************************************************
Elaine May 3 12:37 AM
Thanks for all your help Terence!! I will clean up
the code tomorrow. Tell me what else I should do.
Preliminary version of MailMessage in proj4/junk in CVS
The toBytes function in the Accessor section currently
converts integers to 4 byte numbers. This might have to
be changed later.
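For readers following along, here is a minimal sketch of what a 4-byte integer conversion like the toBytes described above could look like. It is not the group's MailMessage code: the class name IntBytes, the fromBytes helper, and the big-endian (network) byte order are all assumptions made for illustration.

```java
// Hypothetical helper, not the group's MailMessage code.
// Assumes big-endian (network) byte order, which the log does not specify.
final class IntBytes {
    // Encode a 32-bit int as exactly 4 bytes, most significant byte first.
    static byte[] toBytes(int value) {
        return new byte[] {
            (byte) (value >>> 24),
            (byte) (value >>> 16),
            (byte) (value >>> 8),
            (byte) value
        };
    }

    // Decode 4 bytes starting at offset back into an int.
    // The & 0xFF masks undo Java's sign extension of byte values.
    static int fromBytes(byte[] b, int offset) {
        return ((b[offset] & 0xFF) << 24)
             | ((b[offset + 1] & 0xFF) << 16)
             | ((b[offset + 2] & 0xFF) << 8)
             | (b[offset + 3] & 0xFF);
    }
}
```

A fixed width keeps header parsing simple on the receiving side, at the cost of spending 4 bytes even on small values such as port numbers.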
****************************************************
Terence May 2 7:11 PM
don't forget the design issue for design doc.. i am done with my part!
****************************************************
Katherine May 2 5:20pm
Yeah, i'll write the test cases tonight.
*****************************************************
Amy May 2 4:15 PM
kat: i put part 2 in design-draft.txt (in design directory)
can u take a look at it? and tell me if there's anything wrong w/
it..and also i didn't write test cases, u can help me write that ok?
*****************************************************
Terence May 2 12:28 PM
A thousand thanks in advance, elaine!
elaine, i put up all the skeleton for you. with all the pdl,
signature... so please please write clean code!
Michael, your task is way too simple, so i expect you to have high
quality, completed code.. so i am not going to help you
I rewrote connection
the files are in junk
*****************************************************
Elaine May 2 11:03 AM
Understood. Will start coding after Discussion at 5.
*****************************************************
Terence May 2 7:15 AM
i completely rewrote it because some parts do not make much sense. i
looked at the autograder test cases and they seem extremely harsh. so
could we start today?
elaine could you do the connect module part? since you are familiar with file
descriptor and your own code. please write clean code k? it should not
be that hard, because i say exactly what is to be done in the design
doc. you can use what i wrote in the design doc as pdl. so...
michael could you do mailMessage module? that's the EASIEST thing of part 1,
so could you make hundred percent sure it is done?
i will do the Send and Receive Module.
Let's get this done by today, k?
My goal is to get the code done by saturday. I am sure i cannot
accomplish this by saturday. we need to work in parallel!
*****************************************************
Terence May 1 4:45 AM
Me and Michael put up part1b in design directory. it is partially
done. i need to ask TA one question and i think i will know what exactly
is going on.
*****************************************************
Amy April 30 4:34 PM
Design Doc Review Mon 12:30 PM
*****************************************************
Terence April 27 1:00 AM
for the design, let's have something by tuesday. done with design on
wednesday, talk to emil on wednesday, start coding on thursday.
check this out
i got ten page paper to do, 61b two projects and two home work to
grade, do 170 homework, 4 162 lectures to watch, research project to
finish up... so don't complain. you are not the only one who is busy
michael, elaine. let's meet together online on monday, be prepared. we
try to get it done by tuesday.
I just want myself to have more time, just in case it ends up as bad
as proj3
*****************************************************
Amy April 26 9:45 PM
i submitted the design doc, if u find anything wrong w/ it...let me
know we still have time to change it since we used a slip day..it is
officially due on sat.
Proj 4 division:
Part 1:
Terence, Elaine, Michael
Part 2:
Katherine, Amy
some kind of deadline to talk about it?
*****************************************************
Amy April 26 8:28 PM
i think the prob i m thinking of is let say process 1 tries to get a
page to do read/write, so it disabled the interrupt and it called
handlepagefault, but no page is available...then what happen is it
will yield (but still in disable interrupt mode) to let say process 2,
and process 2 will work on swap, and etc without any interrupt...so it
will not really have parallelism...(i m not so sure if this will be a
big big prob...i can't think of the real example that it will get
stuck yet...) maybe i m worrying too much?
*****************************************************
Elaine April 26 8:28 PM
Terence: You can beat me up the next time you see me...
I will also return your book.
Amy: I think the design doc is fine. Calls to rwHelper
are much less than calls to handleTLBMiss when I counted them
since only syscalls will make read/writes to memory...I
think this is right, so since there are no interrupts disabled
in handleTLBMiss which is the majority of the cause of page
faults, it will not slow down the program much...should I
add that in? Does anyone think what I said is valid?
The only additional suggestion I would make is to attach the
original design document below the current one b/c Emil wants
it, even though the design has changed drastically.
*****************************************************
Amy April 5:52 PM
looking at design doc, i was wondering if we have probs with excessive
read/write memory b/c of the disable interrupt thingie? b/c only 1
process can do read/write at one time...well it is only a thought
right here...haven't really thought it out...
and also i added a little into design doc...Implementation of lock
vs. interrupt...can u guys/gals take a look at that...tell me if u see
anything wrong w/ it....before i submit...
*****************************************************
Terence April 12:35 PM
forget it..
download the file
gunzip the file
tar xvf the file
cp files to your directory
i could beat you for real? ok i am going to get strings
and stuff. when do you want me to beat you up? today?
your book is here!
*****************************************************
Elaine April 12:14 PM
Help. I can't decompress the file, I tried typing
gzcat proj3-code.0426-1135.tar.gz | tar xf -
but it says permission denied.
Could someone tell me the correct command to unzip
it?
I did copy the files from your home directory and
it's working fine for all test cases I ran.
TERENCE AND AMY: I AM VERY SORRY FOR POSTING ALL THE CRAP BELOW!!!!!
I was looking at an old, incorrect version of the
files. You may beat me now.
I misunderstood the post on 2 physical pages, I thought
they meant that we have to fix that and have one process
run to completion and then switch to another process and
give that process the 2 pages to run.
Terence: I will go read Code Complete now, see everyone
at 9 tonight
*****************************************************
Amy April 26 12:04 PM
ok...will look at design doc
going to be online at 9..so just im me
*****************************************************
Terence April 26 11:51 AM
did i say use the version online?
i don't think i gave amy the final final version either. so ...
there must be more than two physical pages for two processes? so if two
pages does not work, it is NORMAL! sigh...
i read the post, he said you can do whatever to make the code
work. but disable interrupt is not necessary if there is a better way
to do it
first of all, he is talking about the TLB handler!! ok it is TLB
handler, not rwhelper, not syscall.
secondly, is there a better way elaine? say what is in your mind to
implement the rwhelper
again
rather than saying there is a problem, it is better to have a solution.
if it is better, you can change it
*****************************************************
Elaine April 26 11:39 AM
Thanks! Amy's version doesn't work.
*****************************************************
Terence April 26 11:27 AM
cvs vm directory get fuck up
it is in the result.html
so if you want to check out the code. go there. i hope you remember
the password
*****************************************************
Elaine April 26 11:19 AM
Terence, could you please commit the version that passed
all the tests b/c Amy committed her version, and it's not
working! For testJoin, 2 pages, it keeps swapping, 4
pages has Assertion error. I'm sure i have the wrong
version now.
*****************************************************
Elaine April 26 9:33 AM
Also, where is the final version of the project? Is
it in vm? I think I don't have it b/c the dbflags are
not turned off in my version....maybe I am not even
looking at the right code....could someone with the
correct version commit it again? And the correct
UserProcess and UserKernel too, please?
Of course, I don't want to change the code, I just want
to make sure it works....since the next project inherits
from proj3, but we have 16 pages so I just want
to make sure everything works using 16 physical pages.
I believe I am using a wrong
version or something b/c simple test cases are not
working for me...If this is the case, I apologize, and
please ignore all the stuff I said below. (and you may
beat me too if you want)
I mean that testJoin didn't work for me even with 16 physical
pages....my files must be wrong.
I am very sorry...it is not violation of design specs, but
we might lose points on the design document, unless we
say why we chose to do it this way. (of course, we have a
different TA and I don't think Emil will care)
From Andrew in post "Locking TLB Handler"
*****************************************************
Terence April 26 9:30 AM
i edit part 1 and part 3 in the design doc, it is in design
directory. amy, could you spend five minutes to look if it is ok and
submit it? sorry.. i know you need to do your 188..
also, could we talk about it online to divide the work for next
project. it is due next thursday.
tonight 9:00 PM online?
this time i want to start as early as we can. i have paper and stuff
to do too... k? we could go to emil's office hour next
wednesday, talk about it and start coding on wednesday
*****************************************************
Terence April 26 8:14 AM
elaine. yes amy and i have been talking about it for a very long time.
could you give me any reason why you cannot? other than classmates or
newsgroup? if you can, please give us a counter example. it is not
a violation of the specs either.
userprocess vmprocess is not running in userland. you are in
Kernel. everything you write is in KERNEL. that's why there is a thing
called trap from processor to KERNEL! for handlepagefault. that's why
you call the KERNEL for syscall. we did not disable interrupt in TLB, k?
how does that break data abstraction? where is the data?
how does that defeat parallelism? test 10 proved this. the whole point is
to context switch while we are doing io. could you please read the
code again about usingFreeMem?
elaine, if you really want to change the code, go ahead... if you have
a better implementation. i will consider it, but i am already
sick of the project.
yes, i do agree that it is kinda cheap to do disable/enable
interrupt. but we do it for a reason. we had a lock before. but that
won't work because of our current implementation. using interrupt
will be a quick solution. ask yourself what happens if we use a lock
in rwhelper. that is the main function which makes us think it is
wrong to use a lock.
that's all i am saying, it is up to you to use the slip days.
instead of saying something is wrong, it is better to say some other
way is better.
i don't want to be mean, but i am just too tired and sick of this
after working on this shit for so long.. sorry for my grumpiness.
let me explain some more, before you reply.
public int rwHelper(int vaddr, byte[] data, int offsetForData, int lengthToCopy, boolean writing) {
    // get the vpn
    int vpn = vmKernel.pageFromAddress(vaddr);
    TranslationEntry currentIPTE = vmKernel.getIPTEntry(pid, vpn);
    // if not valid
    if (!vmKernel.isEntryValid(vmKernel.getIPTEntry(pid, vpn))) {
        // either not lazy load, or on disk
        db2("page fault on rw virtual memory vpn " + vpn);
        handlePageFault(vpn);
    }
    vmKernel.setEntryDirty(currentIPTE, true);
    // call super
    int ansOnStack = super.rwHelper(vaddr, data, offsetForData, lengthToCopy, writing);
    // context switch fine after this point
    return ansOnStack;
}
here is our original rwhelper function
now we got to implement pinpage
handlepagefault is going to call obtainfreemem which is going to pin
the page. but you have to have some kind of lock around entry valid
and super.rwhelper. otherwise it will context switch right after
handlepagefault. that lock, needless to say, has to be something
other than the lock in pagefault. otherwise there will be a deadlock
so let's say we are using locks for all our implementation.
process 1 has a syscall that calls rwhelper, acquires rwhelperlock, goes
to page fault, acquires pinlock, releases pinlock after
handlepagefault. another process comes and grabs the ppn that it is not
supposed to get, does stuff with it. process 1 continues, doesn't know
somebody is messing around with it, continues the read. you see a problem?
what is your solution elaine?
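One way to picture the race described above: the page is only safe if it stays pinned from the page fault until the copy has finished. The sketch below is not the group's VMProcess code and is not the interrupt-based fix they actually shipped. The Pager interface, faultInAndPin, unpin and readPage are hypothetical names, and physical memory is modeled as a plain array of byte arrays.

```java
// Hypothetical names throughout; this is not the group's VMProcess code.
final class PinnedRead {
    // Minimal stand-in for the kernel's page-fault and pin bookkeeping.
    interface Pager {
        int faultInAndPin(int vpn);   // returns the ppn, already pinned
        void unpin(int ppn);
    }

    // Copies up to one page of bytes into dest while the page stays pinned.
    static int readPage(Pager pager, byte[][] physMem, int vpn, byte[] dest) {
        int ppn = pager.faultInAndPin(vpn);
        try {
            int n = Math.min(dest.length, physMem[ppn].length);
            System.arraycopy(physMem[ppn], 0, dest, 0, n);
            return n;
        } finally {
            // Only now may another process take this frame.
            pager.unpin(ppn);
        }
    }
}
```

The try/finally keeps pin and unpin in the same function, which is the even shape that was hard to get when unpinning happened elsewhere.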
*****************************************************
Amy April 26 8:04 AM
terence and i have talked about this for a long time...we really
cannot find a way to do it w/ lock around rwHelper (and making the
code even)...
and we r not really accessing tlb when disable interrupt right?
just handling it when there is a tlb miss...
*****************************************************
Elaine April 26 8:00 AM
Actually, now I think the implementation is wrong b/c
I see a post "Locking in the TLB Handler"
that someone asks
"Why can't interrupts be disable during all TLB
accesses?"
Reply from Dan Hettena:
"Well UserProcess and VMProcess are running in
userland. You're not allowed to disable interrupts in
userland, right?"
I think we really need to change the code then, what does everybody
else think?
*****************************************************
Elaine April 26 7:44 AM
Sorry for the long post.
I don't know if this is right, but I am worried
that we disable and enable interrupts in the code. Is that
not breaking a data abstraction? And also, does it defeat
parallelism b/c no other processes can run when we have a
r/w pagefault? Actually I am a little confused on why
interrupts are disabled in rwHelper, but not in
handleTLBMiss since this transfers memory to/from disk
also. Can someone explain?
Some classmates told me that we cannot do this and I am
worried that we might lose points for it b/c of reasons
stated above. It's easy to search and see if we disable
interrupts.
Does anyone want me to post to the newsgroup? Or ask a
TA? Did anyone see anything on the newsgroup regarding
this?
If everyone thinks it's ok, then I will not do
anything, and apologize for wasting your time. I just
didn't want points lost b/c Amy and Terence worked so
hard on the code, it would be disappointing to lose
points for something that we didn't know about...or
were assumed to know about.
*****************************************************
Amy April 26 7:21 AM
i put part 3 in the junk, w/ slight changes.
hmm...i m not so sure about the test cases for that...
remember to do ur group evaluation also!
and who is submitting?
*****************************************************
Elaine April 26 12:07 AM
Part 1 is up in junk
*****************************************************
Michael April 25 9:46 PM
I've put a fairly final version of part 3 in proj3-draft
under the design directory.
Personally I think it's all complete except for the test
case section. If anyone is willing to pitch in there, I'd
really appreciate it. Also, I'm fairly certain there's only
one variable, (well one important one) if anyone disagrees, let
me know and I will change it.
*****************************************************
Terence April 25 7:44 PM
i don't really wanna lose stupid points on design doc.
please have these part in design doc
1) variables
describe each variable
2) methods
please make sure you describe all the functions that are related to
your part.
HIGH level design of each function. no pseudo code please
3) design decision
speed, space, simplicity, modularity, data structure
4) tests cases
describe test cases in DETAIL
*****************************************************
Elaine April 24 1:44 PM
Hey Amy! I'm thinking about unpinning pages on the
next PageFault by the same process so the process has
a chance to use the page...what do you think?
There is a swapping problem for 2 pages like
*****************************************************
Elaine April 24 11:23 AM
Will work more after class at 1. Please ignore the
version i put in vm...it is not correctly implemented
There are synchronization errors. I overwrote mine
by committing yours again.
*****************************************************
Amy April 24 6:24 AM
terence, compiled version in vm (vmprocess and vmkernel) didn't debug
yet....
i will work w/ u when i get home today, around 2.
good luck on ur mt!
*****************************************************
Amy April 24 10:50 PM
terence, and anyone who cares:
hmm...not going too well...this is more complicated than i thought..
anyways..let me explain what i did:
(in vmkernel)
i have a data structure pinPages, which holds a idPinCombo for each
entry. so each entry holds who pinned, and what is the pinstatus.
(in vmprocess)
i acquired a lock when a process called obtainfreeppn. the reason i do
this is so that no 2 processes will be granted the same ppn and screw up
everything..so then when a page is found we will pin this page..and
put the current pid there..lock releases. if we cannot find any unpinned
page, we will yield...
but now where to unpin the page is a little complicated...(so the code
seems to be kinda uneven, like i can't really have something like:
function(){
pin();
...
unpin();
}
we need to unpin in another function...which is not too good...but i
can't think of any other ways right now...
then now is the prob in locking and pin/unpinning in rwhelper...(we
need to lock it so that if context switch our page will not be
overwritten by another process.....still thinking about this...and
where to unpin stuff..
if u really want to look at what i have so far, it is in
nachos/proj3/junk...
but do ur stuff first..
another thing, in rwHelper, y do we have a line:
setEntryDirty(...true)??
*****************************************************
Amy April 24 7:57 PM
answers to terence's questions:
i don't think elaine is going to implement it using monitors, cuz it
gets really complicated..so don't worry about it...
and about the replacement policy, elaine told me that it was about
clock/nth chance algorithm...not the sequential one we have...
i think we r going to use array where each entry will hold pid & pin,
but right now thinking about where i should lock stuff so that it
won't break...
well start my implementation right now..
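For reference, a minimal sketch of the clock (second chance) policy mentioned above, skipping pinned pages. This is not the group's VMKernel code: ClockReplacer and its method names are hypothetical, and the used and pinned bits are kept in plain boolean arrays indexed by ppn.

```java
// Hypothetical sketch of clock replacement; not the group's VMKernel code.
final class ClockReplacer {
    private final boolean[] used;    // set when a page is referenced
    private final boolean[] pinned;  // pinned pages are never chosen
    private int hand = 0;            // next frame the clock hand will inspect

    ClockReplacer(int numPhysPages) {
        used = new boolean[numPhysPages];
        pinned = new boolean[numPhysPages];
    }

    void touch(int ppn) { used[ppn] = true; }

    void setPinned(int ppn, boolean isPinned) { pinned[ppn] = isPinned; }

    // Returns the ppn to evict, or -1 if every page is pinned.
    int pickVictim() {
        // Two sweeps are enough: the first clears used bits,
        // the second is then guaranteed to find any unpinned page.
        for (int i = 0; i < 2 * used.length; i++) {
            int ppn = hand;
            hand = (hand + 1) % used.length;
            if (pinned[ppn]) continue;
            if (used[ppn]) {
                used[ppn] = false;   // give the page a second chance
                continue;
            }
            return ppn;
        }
        return -1;
    }
}
```

An nth chance variant would replace the used bit with a small counter per page and evict only once the counter reaches n.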
*****************************************************
Elaine April 24 4:19 PM
sorry...i posted the wrong info about the monitor...
here is the corrected stuff:
Ex. Condition waitingThreads(pageLock)
Lock pageLock
some function like obtainPages
  pageLock.acquire();
  while (true) {
    if (numUnpinPages > 0) {
      findUnpinnedPage()
      pinPage()
      numUnpinPages -= 1;
      break;
    }
    else
      waitingThreads.sleep()
  }
  pageLock.release()
Wake the sleepers wherever you increase numUnpinPages
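The monitor outlined above can be sketched as runnable Java. Assumptions: this uses java.util.concurrent's ReentrantLock and Condition as stand-ins for the Nachos Lock and Condition classes, and PinMonitor, obtainPage, unpin and unpinnedCount are hypothetical names, not the group's code.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch only: java.util.concurrent stands in for Nachos' Lock/Condition.
final class PinMonitor {
    private final ReentrantLock pageLock = new ReentrantLock();
    private final Condition pageUnpinned = pageLock.newCondition();
    private final boolean[] pinned;  // pinned[ppn] is true while the page is in use
    private int numUnpinnedPages;

    PinMonitor(int numPhysPages) {
        pinned = new boolean[numPhysPages];
        numUnpinnedPages = numPhysPages;
    }

    // Sleeps until some page is unpinned, pins it, and returns its ppn.
    int obtainPage() {
        pageLock.lock();
        try {
            while (numUnpinnedPages == 0) {
                pageUnpinned.awaitUninterruptibly(); // releases pageLock while asleep
            }
            for (int ppn = 0; ppn < pinned.length; ppn++) {
                if (!pinned[ppn]) {
                    pinned[ppn] = true;
                    numUnpinnedPages--;
                    return ppn;
                }
            }
            throw new IllegalStateException("counter and pin table disagree");
        } finally {
            pageLock.unlock();
        }
    }

    // Unpins a page and wakes one sleeper, since the count just went up.
    void unpin(int ppn) {
        pageLock.lock();
        try {
            if (pinned[ppn]) {
                pinned[ppn] = false;
                numUnpinnedPages++;
                pageUnpinned.signal();
            }
        } finally {
            pageLock.unlock();
        }
    }

    int unpinnedCount() {
        pageLock.lock();
        try {
            return numUnpinnedPages;
        } finally {
            pageLock.unlock();
        }
    }
}
```

The wait sits in a while loop rather than an if, so a woken thread rechecks the count before taking a page.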
*****************************************************
Katherine Apr 24 3:55 pm
the replacement policies for the page table are
working now. I'll cvs commit them after we fix the pin.
******************************************************
Amy April 24 2:57 PM
hmm...i m going to work on it when i get home...
i think a list of locks will get really complicated...b/c we need to
set up a waitingqueue that are waiting for n number of locks...i
dunno...maybe i m thinking too complicated?! anyways...let me think
about this some more...
and what if we don't use a list of locks then, how should we put
process to sleep? we don't do currentThread.sleep()/.yield(), do we?
i m still not totally sure where we should pin/unpin pages around
rwHelper for sure...but then what else, we should not pin it in loading
section right? cuz when will we unpin the pages after u load?
elaine said something about handletlbmiss, we should
pin/unpin...explain more...y??
******************************************************
Elaine April 24 12:43 PM
Do people think then that we should have a linkedlist
of pinned pages or something else like that? Instead of
adding an entry to idVpnCombo?
I was implementing it the other way...but i will stop...
it is still in junk if anyone wants to look...i put
the monitor in VMKernel.getRandomPageToReplace
but it doesn't run...will work on it after classes
Amy: if you want to put threads to sleep while waiting
for a lock...you can use Monitor with Condition variables.
Ex. Condition waitingThreads(pageLock)
Lock pageLock
some function like obtainPages
  pageLock.acquire();
  while (true) {
    if (numUnpinPages > 0) {
      findUnpinnedPage()
      pinPage()
      numUnpinPages -= 1;
      waitingThreads.wake();
      break;
    }
    else
      waitingThreads.sleep()
  }
  pageLock.release()
Then, like you said, unpin the pages in handleTLBMiss and in
rwHelper.
******************************************************
Terence April 24 12:19 PM
elaine: I don't think that's the correct way to do so. put it in idvpncombo?
data abstraction violation!
amy: i mean acquire the same lock two times (same ppn). like acquire in obtain
freeppn and swapoutoldpage. you have acquired the same lock two
times. this would lead to deadlock. we have exactly the same idea.
I will leave it to you, k? worst case we could use slip days
******************************************************
Elaine April 24 8:59 AM
I believe that is the correct way to do it. As others have
done...there is an extra entry in the coremap that indicates
whether a page is pinned or not. In our implementation, we
would just need to add a boolean variable to class idVpnCombo
*******************************************************
Amy April 24 7:56 AM
you said don't acquire more than 2 times, isn't it highly inefficient
if numPhyPages >> numofprocesses, b/c processes have to wait just b/c
they have acquired 2 pages...
i m thinking it will be ok if we have an array (size = numPhysPages)
where each entry will be a boolean indicating if the page is pinned or
not...so we just unpin/pin the pages around those functions, if we
can't find a page then we just sleep, what do u guys think?
*******************************************************
Terence April 24 6:30 AM
so we got to do the synchronize part. we are having a global lock
around everything. when we acquire something, it does a year long of
io. no one can interrupt us, then we release it. this is heck of
slow. it is best to do everything in parallel. like swapping pages in
parallel
i read up on something called pin
page. it is basically obtain a lock and release a lock
i don't have time to do it. i left it to you guys to figure that out.
so you have a vector/hashtable of locks for each ppn.
pin = acquire the lock whose ppn matches the input ppn
unpin = release
make sure you don't acquire two times otherwise you deadlock yourself
pin and unpin around:
1) wherever you called
tranferPageFromDisk
tranfterPage2Disk
2) section.loadpage
3) super.rwhelper
basically anything that touch raw memory natively
for now implement it as a list of locks. if it works, make pin using a
list of queues (copy the code from lock) and implement a lock
indirectly. maybe you can find a better idea than this.
this is just my thoughts. if a ta says a different thing, ignore this.
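A minimal sketch of the one-lock-per-ppn idea above. Assumptions: java.util.concurrent.Semaphore with a single permit stands in for a Nachos Lock, and PagePins, pin, tryPin and unpin are hypothetical names. A one-permit semaphore is not reentrant, so pinning the same ppn twice from one thread would block forever, which is the self-deadlock warned about above.

```java
import java.util.concurrent.Semaphore;

// Sketch only: one single-permit semaphore per physical page.
final class PagePins {
    private final Semaphore[] pins;

    PagePins(int numPhysPages) {
        pins = new Semaphore[numPhysPages];
        for (int ppn = 0; ppn < numPhysPages; ppn++) {
            pins[ppn] = new Semaphore(1);
        }
    }

    // pin = acquire the lock for this ppn; blocks while someone else holds it.
    void pin(int ppn) {
        pins[ppn].acquireUninterruptibly();
    }

    // Non-blocking variant: returns false instead of waiting.
    boolean tryPin(int ppn) {
        return pins[ppn].tryAcquire();
    }

    // unpin = release; callers must only unpin a page they pinned.
    void unpin(int ppn) {
        pins[ppn].release();
    }
}
```

Compared with one global lock, a slow disk transfer on one page no longer stops other processes from working on different pages.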
*******************************************************
Michael April 24 2:47 AM
My bad, it was my post. Elaine couldn't see the newsgroup/
see what was posted so I put it here for her. It wasn't
100% clear to me if we were using 2 or 3 at the time.
Thanks for verifying it.
*******************************************************
Terence April 24 2:21 AM
just wondering who posted this....
we are using implementation 3
*******************************************************
This is Terence's original post:
from what i heard from a TA. there are three possible implementations
of the swap disk mechanism.
1) swap disk is just another level of physical memory. anything that
exists in physical memory must exist in swap.
2) swap and physical page is exclusive (space efficient)
3) swap and physical page is exclusive. but you can leave the page in
swap while it's in memory, so we can take advantage of paging it out
without doing any IO because it is already there (most time and space
efficient)
our implementation is the second one. so if we just replace it,
(without putting it back to swap and just kick it out) are we
just going to lazy load it again? (from coff? and allocate a new one
for stack?)
does what you say apply to all three implementations?
tdawg
This is Jeryl's response:
What I said about picking a page to replace applies to any
of these implementations. A dirty page can be chosen for
replacement just like any other page (except you might want
to let dirty pages stick around longer if you can).
What I said about writing *only* dirty pages to the swap
file applies to (1) and (3). If you are doing (2) (I didn't
know (2) was allowed because it's very time inefficient,
and may not meet the spec) then yes you'll have to
write even non-dirty page to the swap file if you're replacing
it. But when I posted, I assumed everyone was doing (3).
-Jeryl C.
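
A rough sketch of the "write only dirty pages" rule for implementation (3), assuming each virtual page keeps its swap slot while it is resident. SwapModel, evict and pageIn are made-up names for illustration, not Nachos APIs, and the disk write is only counted, not performed. The sketch also ignores pages that could be reloaded from the executable instead of being written out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of implementation (3); names are illustrative only.
class SwapModel {
    private final Map<Integer, Integer> swapSlotOfVpn = new HashMap<>();
    private int nextSlot = 0;
    int diskWrites = 0;

    // Evict a page from physical memory; returns true if a disk write was needed.
    boolean evict(int vpn, boolean dirty) {
        boolean hasCopy = swapSlotOfVpn.containsKey(vpn);
        if (hasCopy && !dirty) {
            return false; // the swap copy is still current, so just drop the frame
        }
        if (!hasCopy) {
            swapSlotOfVpn.put(vpn, nextSlot++); // first time out: allocate a slot
        }
        diskWrites++; // stands in for the real write to the swap file
        return true;
    }

    // Page in: the swap slot is kept, so a later clean eviction costs no I/O.
    boolean pageIn(int vpn) {
        return swapSlotOfVpn.containsKey(vpn);
    }
}
```

Under implementation (2) the slot would be freed on page-in, so every eviction would take the write path.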
*******************************************************
Katherine Apr 23 6:15pm
i added a line in saveTLB(int processId) in VMKernel.java
after setEntryDirty(iptTE, true);
i added
setEntryUsed(iptTE, true);
that seems to fix the testJoin.coff, testJoin2.coff problems.
also, for both testJoin.coff and testJoin2.coff, it seems
like the first line of printf("ABCDEFGHIJK") is not printing
out the whole sentence inside printf(). for example, it's only
printing ABC but not DEFGHIJK.
I'm not sure why, but that only happens to the first call to
printf(".....") in a function.
Also, in testJoin.c, i commented out the line
printf("about to join"); (located in the middle of the function)
because it won't run otherwise....
strange though
so i guess the testJoinx.coff's work now....
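
a small sketch of the idea behind the saveTLB change, assuming the point is to copy both the dirty and the used bits from each valid TLB entry back to the matching page table entry. Entry and TlbSync are hypothetical stand-ins, not the real VMKernel.java code, and OR-ing the bits is my assumption (the actual fix sets them to true directly).

```java
import java.util.Map;

// Hypothetical stand-in for a translation entry; the real Nachos class differs.
class Entry {
    int vpn, ppn;
    boolean valid, dirty, used;

    Entry(int vpn, int ppn) {
        this.vpn = vpn;
        this.ppn = ppn;
        this.valid = true;
    }
}

class TlbSync {
    // Copy status bits from each valid TLB entry to the matching page table entry.
    static void saveTlb(Entry[] tlb, Map<Integer, Entry> pageTable) {
        for (Entry t : tlb) {
            if (t == null || !t.valid) continue;
            Entry pte = pageTable.get(t.vpn);
            if (pte == null) continue;
            pte.dirty |= t.dirty; // never clear a dirty bit that is already set
            pte.used |= t.used;   // the used bit has to be saved along with it
        }
    }
}
```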
********************************************************
Terence April 23 5:46 PM
amy and i are going to debug tonight around ten. we are going to have a
chat room.. so if anyone wants to join, please do!! i've got a midterm on
thursday, so maybe today is my last day of working on nachos.. sorry..
********************************************************
Terence April 23 15:15 PM
elaine and all: i've got a task for you to do
i know there is an error with the dirty bit and stuff. the reason
why we get a null pointer exception on isEntryValid, the reason why we
get an unhandled exception, the reason why it prints garbage is all because
of this error..... i sat on it for too long, but cannot find it. it
would be nice if someone could do that...
it is something like: either we incorrectly throw away written pages (not
putting them on the disk), or we write the wrong stuff
to physical memory. that's why it is junk.
to test it more easily, set numphyspage to 2, run echo.coff, and in the
processor translate function add if (writing) System.out.print("dirty
bit of tlb of vpn " + vpn)...
********************************************************
Elaine April 23 11:55 AM
Seems to work fine for big read/write and multiple-joins.
Passed 9/10 tests. I'm assuming we run out of time on the
last test b/c of the page replacement policy which is right
now, replace any page sequentially. If that is fixed, I think
we will pass the test.
There is a little problem with running in the background...it
does not print out exit code.
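
One possible alternative to sequential replacement is the clock (second-chance) algorithm, which skips frames whose used bit is set. The log does not say which policy the group ended up using, so this is only a sketch; ClockReplacer and its methods are hypothetical names.

```java
// Hypothetical clock (second-chance) victim picker over per-frame used bits.
class ClockReplacer {
    private final boolean[] used;
    private int hand = 0;

    ClockReplacer(int numFrames) {
        used = new boolean[numFrames];
    }

    // Call whenever a frame is referenced.
    void touch(int frame) {
        used[frame] = true;
    }

    // Sweep from the hand; the first frame with a clear used bit is the victim.
    int pickVictim() {
        while (true) {
            int frame = hand;
            hand = (hand + 1) % used.length;
            if (!used[frame]) {
                return frame;
            }
            used[frame] = false; // give it a second chance
        }
    }
}
```

The loop always terminates: after one full sweep every used bit has been cleared, so the second sweep returns the first frame it sees.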
********************************************************
Elaine April 23 9:25 AM
I will do more testing.
********************************************************
Terence April 23 5:00 AM
fix growswap file bug
fix read invalid vm, it can be on disk too!
something is wrong with the dirty bit.. if i just copy everything back to disk,
this will work.
********************************************************
Terence April 22 12:54 PM
Elaine, are you talking about the stuff in the vm directory or in the junk
directory? i am kind of confused by what you said
elaine: what exactly did you fix in clearTLB? is writeTLBEntry necessary?
please do
prompt> ln -s ~cs162-nx/public_html/log.html ~/where/to/put/it/log.html
you know what to do with /where/to/put/it.. i guess
rm the original log.txt file
********************************************************
Elaine April 22 12:42 PM
Amy fixed getting the sequential page to be replaced...infinite
loop now.
********************************************************
Elaine April 22 11:16 AM
Ooohhhh! It seems to work now...maybe, at least for
testJoin4.c - more testing to be done...Echo works too!
I am happy! ^_^
Nevermind...there are some errors, either infinite loop
or assertion error... |