Abstract
This
study compared student learning outcomes and student perceptions of and satisfaction
with the learning process between two sections of the same course—an
online section and a traditional face-to-face section. Using a
quasi-experimental design, students were randomly assigned to the two course
sections. Conclusions suggest that the face-to-face encounter motivates
students to a higher degree and also provides students with another layer of
information concerning the instructor that is absent in the online course.
Introduction
Recent
research in online education has focused upon whether web based courses provide
students with the same degree of personalized learning and content mastery that
students experience in face-to-face (f2f) classes. Few studies, however,
have utilized experimental designs across several variables, including student learning
as well as satisfaction with the learning experience.
Progress
and innovative use of technology in education have greatly improved the quality
of web based courses (Schott et al., 2003). To determine whether
web based courses indeed provide students with a comparable if not
superior learning experience, researchers over the past 5 years have conducted
a plethora of studies comparing aspects of the traditionally delivered
instruction with online instruction (Rivera, McAlister, & Rice, 2003 not in
ref list). Findings are mixed (in the Rivera study or in
other studies?), but the general consensus is that students learn
just as well using web based instruction, but are less satisfied with the
learning experience. Miller, Rainer, & Corley (2003) noted that the more
negative aspects of web based instruction include student procrastination,
poor attendance, and the students’ sense
of isolation. Other studies noted that online courses are more effective with
particular personality types (Daughbenbaugh et al., 2002 check spelling—this is not
in ref list under this spelling). Few if any studies have utilized
random assignment to determine whether the “average” student might fare just as
well in an online course as in an f2f course. Rather than comparing two
potentially unequal groups, this study utilized random assignment in order to
compare equivalent groups thereby controlling for predispositions towards one
type of learning style over another. Big leap in this
paragraph--- need a transition between the bulk of the para
and the last 3 sentences-- Explain the
self-selection bias that is inherent—that will ease
the lead into the last 3 sentences.
As a LR, this is too brief—there are a lot of
issues and concerns (self-selection, why Universities encourage online learning,
various definitions of online learning, instructor
characteristics and involvement,
university support—financial and otherwise—are just
a few of those issues)--- a fuller discussion
of at least some of these issues would be helpful.
The
course in this study, Early Childhood
Education: Philosophy and Practice, is an entry-level survey course
required for early childhood majors who have just entered their pre-professional
program. (200 level?) Currently there are more than 900 students
enrolled in the Bachelor of Education in Early Childhood Education program
which prepares students to teach children ages 3-8 with a variety of learning
styles including those at-risk, typically developing,
mild to moderately disabled and gifted. (reverse order of
sentences 1 and 2) The f2f sections of the course are scheduled to
meet twice weekly in seminar fashion, while the online
sections...... Content covered in all sections of the course
includes ECE (define ECE) history,
theorists, curriculum, inclusive learning environments, designing and planning
themes, webbing to strategies, evaluation and parent involvement. Central to
the course is the development of reflective thinking and application to
reflective practice.
To
make both sections of the course “equivalent”, the instructor used duplicate
syllabi, including duplicate assignment
requirements. Students in the web based section were required to attend at least
two “Live Chat” sessions per week which served to replace the discussion time
in the f2f section. Students in both sections were given equal credit for
attendance. All students in both sections were assigned to small groups for
in-class or online assignments.
Method
Often
students who enroll in web based courses have a predisposition towards this
means of learning. This issue threatens the validity of findings based
upon comparisons between web based and f2f courses. The groups, by nature of
learning preference and computer comfort levels, are not equivalent (fuller
discussion of this in the LR would be very helpful) and therefore
findings cannot be generalized beyond the restrictions of the studies. To
address this weakness, this study used a quasi-experimental design that combined
non-random selection of whom? with random assignment to the
control (f2f) and experimental (web based) groups. Prior to registration,
students were asked whether they would be amenable to allowing the department
to assign them to either the f2f or the web based section of the course. While
students volunteered to participate in the study, random assignment to the
groups strengthened the internal validity of the study and enhanced group
equivalency. (Were there
students in the f2f and online sections who were not part of the study? Folks
who specifically chose those particular sections?)
To
validate group equivalency, all students (in the study population?
all students in all sections?) completed the VARK (visual, aural, read/write,
kinesthetic)—a diagnostic instrument designed to determine learning preferences
(Copyright Version 4.1, 2002, held by Neil D.
Fleming, Christchurch, New Zealand and Charles C. Bonwell, Green Mountain
Falls, Colorado 80819 U.S.A. citation).
Using the VARK, students are classified
with mild, strong or very strong preferences in any of the four identified
learning styles. In addition, students can show multimodal
tendencies (more than one style appears to be preferred). For the purposes of
this study, students were classified in one of 5 categories—visual, aural,
read/write, kinesthetic, and multimodal learners.
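A rough illustration of this five-way classification is sketched below; the cut-off logic is a hypothetical placeholder and is not Fleming and Bonwell's actual VARK scoring key.

```python
# Hypothetical sketch of the 5-category classification described above.
# The real VARK instrument has its own scoring rules; the "margin" cut-off
# here is an invented placeholder, not the published scoring key.
def classify_vark(scores, margin=2):
    """Return 'multimodal' when more than one style is at or near the top score."""
    top = max(scores.values())
    preferred = [style for style, s in scores.items() if top - s < margin]
    return "multimodal" if len(preferred) > 1 else preferred[0]

print(classify_vark({"visual": 9, "aural": 4, "read/write": 8, "kinesthetic": 3}))
# -> 'multimodal' (visual and read/write are both near the top score)
```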
To
control for other confounding variables that might result from the delivery methods
of two sections of the course, the same instructor taught both sections during
the same semester. The instructor took care to compare the design and delivery
of both sections of the course to ensure that topics covered, work required,
testing, and the classroom experience were as closely matched as possible. The
syllabi of both courses were also compared by a colleague to provide content
validity.
So
this study is based on findings from 2 sections (out of how many sections?) of
this course. How many students were in
each section? Were all students in these
2 sections participants? If not, how
many non-participants were in each section?
In
order to provide an unbiased measure and comparison of student-teacher
interaction between groups, a modified interaction analysis instrument (IA)
based upon the work of
Category
Teacher talks: accepts feelings; praises or encourages; accepts or uses ideas of pupils; asks questions; lectures; gives directions; criticizes
Student talks: responds; initiates
Silence/Confusion
Four
categories were added to “student talks”: “validation of others’ ideas”,
“praise or courtesy remarks”, “questions or asks for clarification”, and
“silence due to ‘down time’”. This last category was designed to earmark extra
time needed in a live chat online. Lengthy contributions in the chat room
require both longer time for typing as well as for reading. In this case,
“silence/confusion” is not an appropriate label for what is occurring. The
“down time” category was used only for the web based course and was not part of the
comparison between groups. Down time was verified by rereading logs of the
live chats. A better
explanation of the IA is needed--- these categories define what the teacher (or
student) is doing when talking? From
this very brief overview, it appears not to be “interactional”
but, instead, one-sided. This may be
another area to discuss in the LR-- a
discussion of analysis tools and the + and - of what they measure (and why a 35
yr old tool is being used in this context)
IA
scoring is measured by having an observer listen to the classroom interaction
and identify the IA category represented by that interaction. The observer marks a
category every 3 seconds. Frequencies of categories are then
tabulated, and preferences or trends can be observed. In
this study, comparisons were made between f2f and web based discussions to
determine whether the interaction experience between the groups varied. (The “observer” sat in on the web-based
discussion, viewing the chats in real time?)
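As an aside for readers unfamiliar with this kind of coding, the tabulation step amounts to counting how many 3-second intervals fall into each category; a minimal sketch with invented codes (not data from this study):

```python
from collections import Counter

# Hypothetical stream of IA codes, one per 3-second interval of a discussion.
codes = ["lectures", "lectures", "asks questions", "student responds",
         "student responds", "silence/confusion", "student initiates",
         "praises or encourages", "lectures", "student responds"]

frequencies = Counter(codes)
for category, count in frequencies.most_common():
    # Each tally represents one 3-second interval of observed interaction.
    print(f"{category:22s} {count:3d} intervals  (~{count * 3} seconds)")
```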
Two
20-minute sessions were randomly selected and video-taped from all possible f2f
classroom discussions. Two corresponding web based chat room discussions were
also monitored for 20 minutes. The resulting frequencies were then compared
using a chi-square test of homogeneity to observe differences between multiple
variables with multiple categories.
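Such a comparison could be computed as in the following sketch; the frequencies are hypothetical placeholders (each column is one IA category, each row one delivery mode), since the manuscript does not report the raw tabulated counts.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical IA frequencies (counts of 3-second intervals) for the ten
# categories; 400 intervals corresponds to one 20-minute session.
f2f_counts = np.array([ 5, 12,  9, 30, 210, 14, 2, 60, 25, 33])
web_counts = np.array([18, 15, 11, 22,  40,  9, 1, 95, 20, 169])

# Chi-square test of homogeneity: rows = delivery mode, columns = IA category.
table = np.vstack([f2f_counts, web_counts])
chi2, p, df, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p:.4f}")
```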
The examination of student learning outcomes compared
group means of student test grades and overall grades using an independent
t-test. Test scores (as opposed to letter grades) were used with the assumption
that they reflected interval spacing meaning?. To measure student perceptions of
student-teacher interactions as well as satisfaction with the course as a
whole, identical end-of-semester evaluations were completed, and an independent
samples t-test was used to compare mean evaluation scores for the groups.
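The group-mean comparisons described above are ordinary independent samples t-tests; a minimal sketch with made-up scores (not the study's data):

```python
from scipy.stats import ttest_ind

# Hypothetical overall point totals for each section; the study compared
# group means with an independent samples t-test.
f2f_points = [92, 88, 95, 90, 85, 91, 89, 94, 87, 93]
web_points = [84, 90, 78, 88, 82, 86, 80, 89, 83, 85]

t_stat, p_value = ttest_ind(f2f_points, web_points)  # equal variances assumed by default
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```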
Findings
Sample info: Of the total (100+) students who enrolled in all four
sections of the ECE: Philosophy and
Practice course, 42 agreed to participate in the study. The f2f course section
(control) had 24 students—3 males and 21 females—and the
web based course section had 18
students—1 male and 17 females. The unequal class sizes resulted when some
students either added or dropped the course at a late date after the assignment
control process was halted. All of the students in the f2f course were
considered traditional students in that they enrolled in college right out of
high school. There were two non-traditional students (returning for licensure)
enrolled in the web based course.
Statistical issues:
1. Should the non-trad students be included in this analysis? Need
support one way or the other
2. Question
asked earlier--- were there any students
who were in these 2 sections who were not part of the study?
3. Analysis
needed of the students who dropped the course after
selection. Need to
confirm that there wasn’t differential “dropping.” In other words, did the same types of study
participants drop both courses?
Group equivalency: The VARK survey of learning preferences was
completed by 18 students in the f2f group and 15 students in the web based
group. Why not all students? The learning preferences in each group were
evenly distributed across the learning styles.
A chi square goodness of fit test was conducted using the control group as
expected frequencies and the experimental group as the observed frequencies.
Results showed no statistically significant difference between group learning
preferences (χ² = 3.36; df = 4;
α = 0.05). Therefore it was assumed that the groups were equivalent.
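A sketch of this goodness-of-fit comparison, with hypothetical category counts (the per-category counts are not reported in the manuscript):

```python
from scipy.stats import chisquare

# Hypothetical VARK category counts (visual, aural, read/write, kinesthetic,
# multimodal) for the f2f (control) and web based (experimental) groups.
f2f_counts = [3, 4, 3, 3, 5]   # n = 18, used to form the expected frequencies
web_counts = [2, 3, 3, 3, 4]   # n = 15, treated as the observed frequencies

# Rescale the control-group proportions to the experimental-group total so
# that observed and expected frequencies sum to the same n.
n_web = sum(web_counts)
expected = [c / sum(f2f_counts) * n_web for c in f2f_counts]

chi2, p = chisquare(web_counts, f_exp=expected)
print(f"chi-square = {chi2:.2f}, df = {len(web_counts) - 1}, p = {p:.3f}")
```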
Interaction Analysis: Results of the chi square test of homogeneity revealed that a
statistically significant difference did indeed exist between the nature of teacher/student
interaction in the two groups (χ² = 900.035; df = 9; α = 0.05) confusing—was there 1 interactional score (as this implies) or several subscores? An examination of the standardized
residuals revealed the interaction categories contributing to the differences.
Areas where the observed frequency was significantly higher (H) than expected
for the web-based course included student responds, student supports others in
class, student silence/confusion, and teacher accepts feelings. Lecturing was
the only area lower than expected. Not clear how/why
“lecturing” was included in the online course—how was “lecturing” defined
there? For the f2f course, student responds, student
asks questions, and student initiates an idea were all higher than expected, and silence/confusion
was lower. Instructor lecturing was also higher than expected. Because the categories were imperfectly
defined/described earlier, it is difficult to understand this.
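The standardized residuals referred to above can be illustrated with a self-contained sketch; the counts are hypothetical, and the residuals shown are Pearson residuals, one common form of standardized residual.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2 x 10 table of IA category counts (row 0 = f2f, row 1 = web based).
table = np.array([[ 5, 12,  9, 30, 210, 14, 2, 60, 25, 33],
                  [18, 15, 11, 22,  40,  9, 1, 95, 20, 169]])
chi2, p, df, expected = chi2_contingency(table)

# Pearson (standardized) residuals: cells with |residual| well above 2 are
# usually read as the categories driving a significant overall chi-square.
residuals = (table - expected) / np.sqrt(expected)
print(np.round(residuals, 2))
```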
The
instructor spent less time lecturing in the chat room than in the classroom. In
a web based course, lecturing often takes the form of a web page and is not a
typical use of the chat room. On the other hand, the f2f classroom does not
allow for the clear-cut compartmentalization of lecture versus discussion.
Because only two samples from each group were observed, it is possible that
other f2f sessions may have shown less time spent lecturing. The general trend,
however, is that lecturing did not dominate the web based course discussions. It is also possible that it is not an appropriate measure to discuss at all and perhaps
should not be included here---- unless there is a video component to the online
course that allows for more “traditional” lecturing. Otherwise, it appears that what is being
compared is apples and oranges. There
are enough other measures to consider.
The
instructor also allowed for more and longer periods of silence in the chat room
than in the classroom—most likely due to the expectant? nature of chat room discussions. The
instructor, without the aid of visual contact with the students, was unable to
determine whether students were simply thinking and formulating questions and
answers or whether they indeed had nothing to add. How long were the silences
in f2f vs online?
It was observed that a period of silence was followed by
several contributions from students popping on the screen almost
simultaneously. In an f2f setting, students and the instructor can tell exactly
when the discussion floor is open. The chat room discussions blur this demarcation
into fluctuations of silence and activity.
A
notable difference
between the two groups was illustrated in the first web based session, where
students showed support for one another to a higher degree than expected. Example?
evidence? Chat room discussions may put students and the
instructor on an even footing meaning? thereby encouraging students
to not only support one another but to take on a more empowered role in the
class discussion. Not clear what is meant by “support”
Student Evaluations: Students in both classes
completed identical course evaluations before their final exam. The evaluation
included items that explored student perceptions of both the instructor and the
course. Instructor items focused upon perceived teacher
effectiveness. Course items included those dealing with the general
organization, the value of the course as it related to their major area of
study, the textbooks, exams, and general assignment workload. All evaluations
were anonymous.
Students
in the f2f class rated the instructor and the course significantly higher than
those students in the web based course (with p = 0?? p < .001?). Mean scores for
the f2f and web based classes were 1.22 and 1.82 respectively on a 5 point
scale where a “1” indicated the highest ranking (outstanding) and a “5” the
lowest (poor). In both cases the instructor received very good scores, yet
the students in the f2f course felt the quality of the instructor and the
course to be better. T tests
were then conducted on individual questions to locate where the classes
differed significantly. The alpha level was lowered to 0.01 to control for Type
I error (awkwardly phrased but good choice—why .01? was this calculated (based on # of
measures to be compared) or simply chosen?) and the analysis
revealed statistically significant differences on each of the 22 questions
suggesting that students collect extra information concerning an instructor
based upon direct observation. For example, in the web based course, students
have limited access to instructor interaction with other students. A student in
the web based class typically asks about personal difficulties using private email.
However, it is common for students to ask questions of this type before, during,
and after an f2f class where other students may observe the exchange. It is
logical, therefore, that an instructor might receive a lower rating on an item
like offering assistance to students with problems connected to the course in a
web based course where this interaction is less evident. Which raises the question whether this is a
valid comparative measure to be included here.
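For reference, the item-by-item testing described above could be sketched with an explicit familywise correction; the Bonferroni value below (.05 / 22) is shown only for comparison, since the manuscript does not state how the .01 threshold was derived.

```python
from scipy.stats import ttest_ind

n_items = 22
alpha_family = 0.05
alpha_bonferroni = alpha_family / n_items   # ≈ .0023, stricter than the .01 used

# Hypothetical ratings (1 = outstanding ... 5 = poor) on one evaluation item.
f2f_item = [1, 1, 2, 1, 1, 2, 1, 1, 1, 2]
web_item = [2, 2, 1, 2, 3, 2, 2, 2, 1, 2]

t_stat, p_value = ttest_ind(f2f_item, web_item)
print(f"p = {p_value:.4f}; significant at .01: {p_value < 0.01}; "
      f"at Bonferroni-corrected {alpha_bonferroni:.4f}: {p_value < alpha_bonferroni}")
```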
Overall,
the web based students gave the instructor a high rating and the f2f students
gave a stellar rating. In neither case did the students indicate a negative
experience; the web based students simply reported a slightly less positive one. Interesting
comparisons indicated that the students in the f2f course expected an average
grade of A- while those in the web based course expected a B-. As for
grading the instructor, f2f students assigned an average grade of A and the web
based students assigned a grade of B+. 1. Where did the final grade data come
from? 2.
How did the instructor evaluate students? Was there some area of evaluation that could
not be adequately “proven” by online students?
(“Quieter” students who learn by listening and
maintaining an internal dialogue may not be “seen” online while the same
student may be “obvious” in an f2f class.)
Many
studies have shown a high correlation between student
expected grade and student evaluation of the instructor (discuss in LR).
To determine whether students in one section of the course actually did perform
better than those in the other, exam grades and overall grades were compared.
Three
indicators of student success were examined—midterm examination, final
examination, and overall points earned for the semester (which included other
assignments). Of the three comparisons, only the mean score for overall grade
differed at a significant level (p = 0.02). Students in the f2f course averaged
an A- and those in the web based course averaged a B. Students seemed to
predict their final grade with accuracy, indicating that the grading process for
both sections was clear-cut. The main difference between test scores and
overall points earned for the semester was the other assignments required
throughout the semester. A closer look at student records revealed that
students in the web based course did not earn lower grades on these assignments
but merely failed to submit some of them, suggesting that learning outcomes were
similar but that the personal contact of an f2f course positively influenced
and motivated students to turn in assignments.
Conclusions/Recommendations
General
findings of this study showed that two equivalent groups, randomly assigned to
either an f2f or web based course, do not have equal experiences in the area of
student perceptions defined as?. Learning outcomes can be
considered to be equal based upon test scores. Because the instructor was the
same for both courses, it can be concluded that the course delivery (f2f
versus online) may have some effect on the variables examined rather
than instructor differences. (How much experience did the instructor have
teaching online sections?) The interaction analysis showed that the
instructor tended to lecture less in the web based course See comments earlier.
Because only two pairs of discussion sessions were scrutinized, findings in
other areas of interaction, and especially student
interaction, may not generalize. Student evaluations of the course and the
instructor also differed. Students in the web based course tended to rate both
the course and the instructor lower than students in the f2f course—although
ratings for both groups were considered to be above average. Finally, student
achievement differed only in the area of course assignments. Test scores showed
no statistically significant difference, indicating that student mastery levels
were essentially the same, yet students
in the web based course were more likely to omit submitting one or more
assignments (did these missed assignments cause their lower
grades?). So, students in the web based course may be less
motivated to complete assignments. (Because they are not
counted towards the final grade?)
Limitations
of this study include a small sample size and a restricted population. How might
that have impacted the results? Future
research might apply this model to other content areas and explore the specific
differences in course delivery methods that account for student perceptions.
Some of the differences between the f2f and web based courses in this study
were due to the random assignment of students to the groups. Students who may
not have been familiar or comfortable with web based courses were in the experimental
group. Isn’t that exactly one of the goals of the
study? This should not be discussed in a
para on limitations.
Their perceptions and experiences, therefore, were more
indicative of those of the “average” student as opposed to those of students who
generally enroll in web based courses.
References
1. Article
titles are not surrounded by quotation marks
2. Review correct “retrieval
statement” format
3. All references must be cited, all
citations must be referenced.
Cohen, First Initial?
(1962). The statistical power of abnormal-social psychological research: A
review. Journal of Abnormal and Social
Psychology, 65, 145-153. not cited
Daughenbaugh, R., Daughenbaugh, L.,
Surry, D., & Islam, M. (2002). “Personality type and online
versus in-class course satisfaction.” Educause Quarterly, 25 (3), 71-72. not
cited
Miller, M.D., Rainer Jr.,
P.K., & Corley, J.K. (2003). “Predictors
of engagement and participation in an on-line course.” Online
Journal of Distance Learning Administration, 6 (1). Retrieved from http://www.westga.edu/%7Edistance/ojdla/summer62/schott62.html,
December 12, 2003.
Rivera, J.C., McAlister, M.K., & Rice,
M.L. (2002). “Comparison of Student Outcomes &
Satisfaction Between Traditional & Web Based Course Offerings.”
Online Journal
of Distance Learning Education Administration, 5 (3).
Retrieved from http://www.westga.edu/%7Edistance/ojdla/fall53/fall53.html
, December 3, 2003. not cited
Rivera, McAlister, & Rice (2003) ??
Schott, M., Chernish, W., Dooley, K.E., & Lindar, J.R. (2003). “Innovations
in distance learning program development and delivery.” Online Journal of Distance Learning
Administration, 5 (2). Retrieved from http://www.westga.edu/%7Edistance/ojdla/summer62/schott62.html,
September 9, 2003.
Rating Table

| Quality Statements | Strongly Agree | Agree | Disagree | Strongly Disagree |
| A: The manuscript deals with a significant problem. | 1 | 2 | 3 | 4 |
| B: The manuscript is creative or deals with the | 1 | 2 | 3 | 4 |
| C: The author included the appropriate background | 1 | 2 | 3 | 4 |
| D: The author's writing style is appropriate, | 1 | 2 | 3 | 4 |
| E: The study is conceptually based and theoretically | 1 | 2 | 3 | 4 |
| F: The analyses are sound and appropriate. | 1 | 2 | 3 | 4 |
| G: The conclusions and/or policy implications flow | 1 | 2 | 3 | 4 |
| H: Readers of AEQ will find this article | 1 | 2 | 3 | 4 |

COMMENTS: See Editor’s Notes, above and below

REVIEWER'S NAME: Do not sign your name. Instead, write your 3-letter identity.
This manuscript asks a very important question in higher
education: how viable are online courses for the typical (non-self-selected)
student? As the author suggests, very few
studies have truly addressed this question.
Thus, this work is important in that it is an initial attempt to address
the concerns and the realities of online courses in a traditionally face-to-face environment.
The manuscript shows promise but there are a number of
unresolved issues that need to be more clearly addressed before
acceptance. Thus, I would recommend that
the work not be accepted now but that the author be asked to resolve those
difficulties and resubmit in the very near future.
There are three major areas to address:
1. Statistical:
a. An analysis of “group leavers” would
resolve some of these concerns.
b. It is not clear if the 2 groups include members who are not
participants in the study. If the two
groups include different numbers of “non-participants,” this could bias the
results. This needs to be addressed.
c. Because the author performed multiple
analyses, (s)he lowered the significance level from .05 to .01. It is not clear whether this was
based on the number of post hoc analyses being performed or whether this was an arbitrary choice.
2. Lit Review
a. The LR was suggestive but not
complete. The author raised a number of
concerns regarding online vs f2f teaching in the body
of the text that should have been clarified in the LR.
b. Choice of analysis instrument was not
clear—a specific discussion of methodological tools was warranted in the LR.
3. Methods/Results
a. Better description of the specific
measures on the IA is needed. Some
seemed superfluous to this study: why were they included?
b. A richer discussion of the value of some
of the findings is needed.
All in all, this work discusses an important topic that
needs a fuller airing in the academic community. I encourage the author to rework this paper
and resubmit.